243 Condition Monitoring jobs in Vietnam
Reliability Engineer

Posted 21 days ago
Job Description
This is a key position in our Quality **Reliability Engineering** organization, based in Vietnam.
The ideal candidate will have a solid background in reliability and simulation, process/manufacturing experience in the consumer electronics (electromechanical) industry, and a record of effective supplier quality management, with in-depth knowledge of reliability testing methodology and reliability analysis.
To qualify for this exciting opportunity, this candidate must possess effective communication, organizational, technical and documentation skills. You must function well in a fast-paced collaborative environment and be able to apply critical thinking and strong problem solving skills to complex production environment scenarios to ensure high availability.
Microsoft's mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.
**Responsibilities**
+ Candidate will be responsible for monitoring product performance in the field and will work closely with manufacturing partners and component vendors to perform failure analysis and drive corrective actions.
+ Provide reliability guidance to contract manufacturers and suppliers for the release-to-manufacturing phase and lab qualifications.
+ Develop suppliers to set up Ongoing Reliability Tests (ORT) to monitor mass production.
+ Work with China and Redmond Reliability teams to develop and to document reliability qualification plans for new products.
+ Manage multiple design qualification activities and development schedules to improve product quality.
+ Evaluate and drive the effectiveness of reliability stresses, or resolve reliability issues related to products.
+ Proactively drive root cause investigation of reliability failures and work with cross-functional teams to close issues.
+ Participate in component vendor selection activity and drive component qualification activity for components that are critical and strategic to Microsoft product requirements.
+ Understand the technology, materials, and failure mechanisms associated with major electronic and electro-mechanical components and materials.
+ Use knowledge of process capability for electronic component production as well as system-level performance requirements to establish Critical-to-Quality performance metrics.
+ 0-25% overseas travel as needed.
**Qualifications**
**Required Qualifications:**
+ Master's Degree in Mechanical Engineering, Materials Engineering, Reliability Engineering, Electrical Engineering, or related field AND 2+ years technical engineering experience OR Bachelor's Degree in Mechanical Engineering, Materials Engineering, Reliability Engineering, Electrical Engineering, or related field AND 3+ years technical engineering experience OR 7+ years technical engineering experience.
+ Solid experience working with suppliers to set up reliability labs and run qualification plans during the development and sustaining phases.
+ Familiar with the various environmental and mechanical reliability test methodologies in ASTM/IEEE industry standards and understand basics of.
+ Solid experience in hardware verification, PCBA and Box Build Assemblies process controls and quality controls.
+ Effective English communication skills, verbal and written.
**Preferred Qualifications:**
+ Statistical analysis skills; familiar with tools such as Minitab or Weibull.
+ Understand physics of failure (PoF), with good basic failure analysis knowledge.
+ DFMEA experience.
+ Effective communication and collaboration skills to work with people from a variety of technical backgrounds.
Microsoft is an equal opportunity employer. Consistent with applicable law, all qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.
Reliability Engineer
Posted today
Job Description
(Salary: Negotiable)
Work with cross-functional teams to manage overall supplier quality, project development, and project management.
Take the lead in improving suppliers' performance and developing key suppliers.
Audit, evaluate, train, and qualify new suppliers.
Identify and report on potential failures within a process.
Designing new systems and performing predictive analysis
Planning performance evaluation assessments
Conduct failure mode and effects analysis (FMEA), reliability hazard analysis, dynamic reliability block-diagram analysis, fault tree analysis, accelerated testing, and avoidance of single points of failure (SPOF).
Perform root cause analysis and create corrective action plans.
Perform functional analysis and functional failure analysis (e.g., function FMEA, FHA, or FFA).
Conduct Operational hazard analysis
Creating and monitoring life cycle asset management plans
Develop manufacturing and test processes, equipment, and materials for FOL (Front of Line) and EOL (End of Line).
Provide technical leadership in resolving any process- or test-related issues.
Develop DOEs, statistical process analysis, process specifications, D/PFMEA, and process control plans.
Process capability analysis, validation, improvement and qualification.
Inspection, measurement, tester validation and qualification through MSA.
**Position**: Staff/Specialist
**Employment type**: Full-time
**Benefits**:
Interesting bonus scheme
Employee Benefit insurance
A great working environment where you are working with the best people in the industry
**Minimum education requirement**: University degree
**Job requirements**:
Bachelor's degree in Electrical/Electronics/Electro-Mechanical Engineering or equivalent practical experience.
At least 6 years of work experience in quality/reliability engineering and manufacturing engineering.
Experience with electronic modules such as smartphones, tablets, PCs, and smart wearable devices, and their manufacturing processes.
Good logical thinking, quality knowledge, and process methodology.
Excellent communication and coordination skills.
Good written and spoken English skills.
**Gender requirement**: Male/Female
**Industry**: Electromechanical, Electronics
University degree
Not required
Reliability Engineer, eero

Posted 7 days ago
Job Description
The Role:
This role is for a Reliability Engineer who is passionate about and takes great pride in launching high-quality, reliable products into the consumer market. The position will collaborate with cross-functional team members to establish product design and performance validation test methodologies and performance specifications to ensure that the product is ready for production. The ideal candidate will be responsible for system reliability testing, packaging reliability testing, accessories reliability testing, reliability calculations, statistical analysis, performance tests, and field analysis of eero products from prototypes to mass production. You will partner with the Packaging Engineering, Accessories, Product Management, Development Engineering, Material Sourcing, Manufacturing Engineering, and Strategic Product Development teams, as well as Manufacturing Partners and Component Suppliers, to achieve key product quality, cost, and reliability goals. Specifically, this person will work with eero cross-functional engineers to create and assess reliability tests and acceptance criteria for new and sustaining products, identify critical field issues, and actively implement corrective and preventive actions with partner CMs, JDMs, and/or ODMs based in Asia.
What you'll do:
● Perform system, packaging, and accessories reliability testing; review test reports; and highlight the reliability results to the cross-functional team (Product Design, Hardware team, Packaging team, Design & Development, Product, Operations).
● Develop system, packaging, and accessories reliability plans with goals and quantifiable results (e.g., ISTA, MTBF level, ALT, and 85C/85RH tests).
● Perform DFR (Design for Reliability) reviews and DFMEA (Design Failure Mode and Effects Analysis) reviews by partnering with Engineering and Manufacturers to achieve key reliability goals (e.g., design margin analysis, preferred parts, suppliers, component/system, alternative components or technologies).
● Execute reliability qualification plans by driving external labs and leveraging internal resources.
● Support system-level product (router) reliability testing, DOEs, and the study and development of new test cases.
● Verify suppliers' reliability calculations and tests at component level in partnership with Component, Supplier Quality and Supply Chain engineers.
● Write Engineering Verification Test plans, execute plans, and create test reports.
● Define a set of production reliability tests and methodologies (for packaging, accessories, and products), such as ORT, ESS, FMEA, and DFX, to ensure field-reliable parts and products.
● Apply metrics for monitoring field reliability performance and act dynamically on the findings with corrective actions, using best-in-class methodologies such as 8D, fishbone, Six Sigma, DMAIC, FMEA, and SPC.
● Identify field trends and set up alerts using applications such as Weibull+ or JMP to perform sound analysis and predictions based on field data.
● Report on findings at core team meetings to reach consensus on the actions to be applied.
● Report critical issues and findings to executive leadership for directions and/or escalations.
● Analyze failures from field, production, and qualification tests, and provide improvement suggestions based on the failure mechanisms and root causes to Development Engineering, Manufacturers, and Customer Support.
● Create a culture of continuous improvement at eero and inspire best practices by writing guidelines, providing feedback, solutions, applying innovative metrics and measurements, planning DOEs, and benchmarking the state of the art in comparable industries, technologies and companies.
Basic Qualifications
● Technical Degree (BSEE, BSME, BSCS, Physics, Industrial Engineering, other)
● 8+ years of combined experience in Packaging, Accessories and Product Reliability Engineering and Testing for New Product Introductions and Sustaining.
● 5-10 years of combined experience in consumer electronics manufacturing; experience with sensors, RF, and/or Wi-Fi based products is a plus.
● Experience with industry standards (ISTA, IEC, UL, ASTM, ANSI, TUV, ISO, IPC, MIL, etc.)
● Demonstrated excellent leadership, communication, and interpersonal skills.
● Results driven, team player, proven ability to influence design teams and cross-company teams.
● Must have the ability to thrive in a fast-paced, team-oriented environment.
● Familiarity with documentation required for manufacturing assembly & test of RF systems, particularly BOMs, Schematics, Block Diagrams, Release Notes, System Requirement Documents, Assembly Instructions, MFG Test Instructions.
● Ability to work independently on testing & diagnosis of system failures down to the board level and component level product, including test, debug and repair. Work with Design Engineers to resolve issues via reliability test results.
● Strong analytical, technical, problem-solving skills.
● Strong verbal & written communication skills, excellent interpersonal skills, ability to work in a variety of locations (office, external labs, customer sites, contract manufacturers)
Preferred Qualifications
● Familiarity with various operating systems (Mac, Windows, Linux) both GUI and Command Line
● Familiarity with various programming languages and tools (LabVIEW, C/C++/C#, Excel VBA, Python, HTML/XML, scripts, batch files, Weibull+, JMP) and test equipment such as strain gauges, environmental chambers, impact and vibration testers, power supplies, salt spray and UV testers, etc.
Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit for more information. If the country/region you're applying in isn't listed, please contact your Recruiting Partner.
Senior Site Reliability Engineer
Posted 2 days ago
Job Description
- Design, implement, and maintain highly available and scalable production systems.
- Develop and manage infrastructure automation using tools like Terraform, Ansible, or Chef.
- Implement and manage container orchestration platforms (e.g., Kubernetes, Docker Swarm).
- Set up and maintain robust monitoring, logging, and alerting systems (e.g., Prometheus, Grafana, ELK stack).
- Lead incident response efforts, troubleshoot complex issues, and conduct thorough post-mortems.
- Define and track Service Level Objectives (SLOs) and Service Level Indicators (SLIs).
- Automate operational tasks and reduce manual intervention (toil reduction).
- Collaborate with development teams to ensure the reliability and performance of new features and services.
- Participate in on-call rotation to provide 24/7 support for critical systems.
- Contribute to capacity planning and performance tuning.
- Ensure security best practices are implemented across the infrastructure.
- Document system architecture, operational procedures, and incident reports.
- Bachelor's degree in Computer Science, Engineering, or a related field; Master's degree is a plus.
- 5+ years of experience in Site Reliability Engineering, DevOps, or Systems Engineering.
- Proven experience with cloud platforms such as AWS, Azure, or GCP.
- Expertise in scripting languages (e.g., Python, Go, Bash).
- Strong understanding of networking concepts (TCP/IP, HTTP, DNS, load balancing).
- Experience with CI/CD tools and practices (e.g., Jenkins, GitLab CI).
- Familiarity with containerization technologies (Docker, Kubernetes).
- Excellent troubleshooting, problem-solving, and analytical skills.
- Strong communication and collaboration skills, with the ability to explain technical concepts clearly.
- Experience with databases (SQL and NoSQL) and their administration.
- On-call experience and ability to work under pressure.
Senior Site Reliability Engineer
Posted today
Job Description
Optimizely fosters an inclusive and diverse culture with a global team of 1500+ people spread across the US, Europe, Dubai, Australia, Singapore, Bangladesh, and Vietnam. Our unique work environment focuses on flexibility, trust, teamwork, diversity, and moving fast.
We genuinely believe that our people make all the difference, and once we find the best talent, we go out of our way to nurture them. If you are looking to work on the next generation of digital technologies in a fast-paced and growing environment with industry leaders, Optimizely is the place for you!
**Introduction**:
**Responsibilities**:
- Define a roadmap for all engineering teams to utilize fully automated, self-service, highly scalable, cost-efficient, observable, auditable and reliable infrastructure services as standard practice.
- Drive the execution of this roadmap across the engineering organization, collaborating with SREs and senior engineers across engineering while also performing hands-on work on the most critical challenges.
- Provide expert technical guidance and ongoing engineering design review to teams planning and implementing large migrations, service-oriented architecture, broad architectural shifts, and capacity growth.
- Build a metrics-driven operational culture standardizing our practices for SLO definition and review as well as for logging, monitoring, alerting, and on-call practices.
- Make iterative improvements to blameless incident management processes, root cause analyses, outage prevention, and service recovery strategies across the engineering organization.
- Partner closely with Security, Quality, and Product teams to achieve high priority security, privacy, compliance, reliability, and business-continuity objectives on our overall roadmap.
- Propose and drive large improvements to production systems to achieve a significant impact to our business and engineering teams.
- Mentor and coach engineers to be curious and effective at discovering and solving technical challenges.
**Knowledge and Experience**:
- You have proven experience (6+ years) demonstrating hands-on technical leadership and business impact in combining software engineering skills with systems engineering skills to solve complex automation and reliability challenges.
- You have deep technical experience with various cloud providers, containerization technologies, automated deployment frameworks, orchestration frameworks, monitoring, logging, alerting, system internals, networking, databases, distributed systems, and service-oriented architecture.
- You have the skills to implement load, stress, performance, and reliability testing standards at scale to improve service, platform, and infrastructure resiliency.
- You promote openness, diversity of opinions, and inclusive discussions at all times to evaluate a wide variety of ideas and perspectives in solving challenging problems.
- You demonstrate clear decision making and good trade-offs in complex situations comprising multiple opinions, needs, teams, technologies, cloud providers, and architectural settings.
- You communicate effectively with stakeholders ranging from executives to junior engineers across the breadth and depth of the engineering organization.
- You exemplify high accountability, integrity, and resilience to maintain focus on both big-picture goals and milestones to get there.
- You enable the engineering organization to innovate and deliver with greater speed and safety.
**Education**:
BS CS or equivalent industry experience
**Competencies**:
- Displaying Technical Expertise
- Critical Thinking
- Testing and Troubleshooting
- Demonstrating Initiative
- Utilizing Feedback
**About Us**:
- 5 working days /week with flexible working time and no overtime.
- Annual luxury Kick-off vacation.
- International, professional, creative working environment and talented teams
- Onsite opportunities in Europe and US.
- Cultural, sports, and arts clubs and activities sponsored and/or supported by the company (e.g., football, gym, swimming, guitar, English).
- Powerful workstation: Core i7-9700, 16-32 GB RAM, 02 x QHD 2560x1440 monitors (2K resolution).
- 100% official salary during the probation period, 13th month salary, annual salary raises.
- 12 days of annual leave and 3 days of company holidays (New Year eve 31/12, Juneteenth day 18/6, Work Anniversary)
- Up to 03 extra paid-leave days per year.
- Social, health, and unemployment insurance are based on 100% of gross salary and fully paid by the company.
- Extra bonus of $60 per special occasion (birthday, Labor Day, National Day, Solar New Year, Lunar New Year).
- Lunch allowance of $30 per month.
Remote Lead Site Reliability Engineer
Posted today
Job Description
Responsibilities:
- Design, build, and maintain scalable and reliable production systems.
- Develop and implement automation strategies for deployment, monitoring, and incident response.
- Identify and address performance bottlenecks and proactively mitigate risks.
- Lead troubleshooting efforts and conduct post-mortems for incidents.
- Collaborate with software engineers to ensure reliability is designed into new features.
- Develop and maintain system monitoring, alerting, and logging infrastructure.
- Manage CI/CD pipelines and optimize deployment processes.
- Mentor and guide junior SRE team members.
- Contribute to architectural discussions and technology selection.
- Ensure system security and compliance with industry standards.
Qualifications:
- Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
- 5+ years of experience in Site Reliability Engineering, DevOps, or System Administration.
- Expertise in cloud platforms such as AWS, Azure, or GCP.
- Proficiency in at least one scripting language (e.g., Python, Go, Bash).
- Experience with containerization technologies like Docker and Kubernetes.
- Strong understanding of networking concepts (TCP/IP, DNS, HTTP).
- Experience with infrastructure-as-code tools (e.g., Terraform, Ansible).
- Proven ability to diagnose and resolve complex system issues.
- Excellent communication and collaboration skills for remote teamwork.
- Experience with monitoring tools (e.g., Prometheus, Grafana, ELK stack).
Senior Site Reliability Engineer (Remote)
Posted today
Job Description
Responsibilities:
- Design, implement, and manage highly available and scalable systems.
- Develop and maintain infrastructure automation tools and scripts.
- Build and manage CI/CD pipelines for efficient software deployment.
- Implement and optimize monitoring, alerting, and logging systems.
- Lead incident response and conduct post-mortems to prevent future issues.
- Collaborate with development teams to ensure system reliability and performance.
- Conduct capacity planning and performance tuning.
- Automate operational tasks and reduce manual toil.
- Contribute to the design and architecture of new systems and features.
- Mentor junior SREs and share best practices.
Qualifications:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- 5+ years of experience in Site Reliability Engineering, DevOps, or a similar role.
- Strong experience with cloud platforms such as AWS, Azure, or GCP.
- Proficiency in scripting and programming languages like Python, Go, or Java.
- Experience with containerization technologies (Docker, Kubernetes).
- Expertise in infrastructure as code (IaC) tools (Terraform, Ansible).
- Knowledge of monitoring tools (Prometheus, Grafana, Datadog).
- Strong understanding of networking, operating systems, and distributed systems.
- Excellent problem-solving, analytical, and debugging skills.
- Ability to work effectively in a remote team and manage complex projects.
Senior Site Reliability Engineer (SRE)
Posted today
Job Description
Key Responsibilities:
- Design, implement, and manage scalable and reliable cloud-based infrastructure (e.g., AWS, Azure, GCP).
- Develop and maintain automation tools and scripts for deployment, monitoring, and incident management.
- Implement and enforce best practices for system monitoring, alerting, and logging.
- Participate in on-call rotation to respond to and resolve production incidents.
- Conduct root cause analysis for production issues and implement preventative measures.
- Collaborate with development teams to improve application reliability and performance throughout the software development lifecycle.
- Manage and optimize CI/CD pipelines for efficient and safe software deployments.
- Develop and maintain infrastructure as code (IaC) using tools like Terraform or Ansible.
- Contribute to capacity planning and performance tuning of systems.
- Document system architecture, operational procedures, and incident post-mortems.
- Stay current with emerging technologies and industry best practices in SRE and cloud computing.
- Mentor junior engineers and promote a culture of reliability and operational excellence.
Qualifications:
- Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
- Minimum of 6 years of experience in system administration, DevOps, or Site Reliability Engineering.
- Proficiency with cloud platforms (AWS, Azure, or GCP) and containerization technologies (Docker, Kubernetes).
- Strong scripting skills (e.g., Python, Bash, Go).
- Experience with monitoring tools (e.g., Prometheus, Grafana, Datadog) and logging systems (e.g., ELK stack).
- Familiarity with CI/CD tools and practices (e.g., Jenkins, GitLab CI).
- Solid understanding of networking concepts (TCP/IP, DNS, HTTP).
- Experience with configuration management tools (e.g., Ansible, Chef, Puppet).
- Ability to work independently and manage priorities in a remote, fast-paced environment.
Remote Senior Site Reliability Engineer
Posted 2 days ago
Job Description
Key Responsibilities:
- Design, build, and maintain reliable, scalable, and high-performance infrastructure.
- Develop and implement automation for operational tasks, deployments, and incident response.
- Monitor system health, performance, and availability, and establish effective alerting mechanisms.
- Participate in on-call rotations and manage production incidents.
- Conduct root cause analysis for production issues and implement preventative measures.
- Manage cloud infrastructure resources and optimize for cost and performance.
- Collaborate with software engineering teams to improve the reliability and deployability of applications.
- Develop and maintain infrastructure-as-code using tools like Terraform or Ansible.
- Perform capacity planning and performance tuning.
- Contribute to disaster recovery planning and testing.
Qualifications:
- Bachelor's degree in Computer Science, Engineering, or a related technical field, or equivalent practical experience.
- 5+ years of experience in Site Reliability Engineering, DevOps, or Systems Engineering.
- Proven experience with cloud platforms such as AWS, GCP, or Azure.
- Strong proficiency in at least one scripting language (e.g., Python, Go, Bash).
- Hands-on experience with containerization technologies like Docker and orchestration tools like Kubernetes.
- Solid understanding of networking concepts (TCP/IP, DNS, HTTP, load balancing).
- Experience with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack).
- Familiarity with infrastructure-as-code tools (e.g., Terraform, Ansible, Chef, Puppet).
- Experience in building and managing CI/CD pipelines.
- Excellent problem-solving skills and the ability to work under pressure.
- Strong communication and collaboration skills, especially in a remote environment.
Principal Site Reliability Engineer (ZaloPay)
Posted today
Job Description
- Eliminating toil through automation across all layers: infrastructure provisioning, configuration management, deployment, testing, and operation, both on premises and in public clouds (Google Cloud and AWS).
- Working on retooling our infrastructure to provide an agile, cloud-based foundation that provides a common infrastructure management and automation framework.
- Interfacing directly with senior staff members within the organization to discuss and assess compliance with IT policies, standards and procedures, suggest opportunities for improvement, and report on the status of specific. Work with development teams throughout the software life cycle ensuring sustainable software releases.
- Practicing sustainable incident response and blameless postmortems
- Help to build methodology to manage infrastructure and platform cost
- Train junior SRE members
- Manage a small SRE team (4-6 members) to drive automation, scalability, high availability, and performance of ZaloPay
**Requirements**:
- Bachelor’s degree with five or more years of work experience.
- Six or more years of SRE relevant work experience.
- Experience in systems architecture; in-depth knowledge of SRE, IT operations, and cloud; coding and scripting experience with Golang, Java, and Python; and experience with automation tools such as Terraform and Ansible.
- Strong experience with Google and AWS cloud environments, with working knowledge of standard cloud services, features, and tools, and certification in appropriate areas.
- Strong experience automating the provisioning of dependency software on premises.
- Experience building disaster recovery solutions is preferred.
**Preferred**
- Five or more years of experience working with middleware technologies such as Kafka/RabbitMQ, Spring Boot, Redis, Elasticsearch, MySQL, and etcd.
- Automation experience and ability to code or script at an advanced level.
- Experience in Cloud & Container platform Strategies, Design, Architecture and Migration.
- Experience designing and implementing CI/CD DevOps solutions with Jenkins pipelines, using Python, Git, Shell, YAML, Kubernetes, and Docker.
- Configuration Management experience with Chef, Puppet, Ansible or Python.
- Experience serving as both a mentor and advocate for your team.
- Experience performing analytics on previous incidents and usage patterns to better predict issues and take proactive actions.