Earn the IEEE Reliability Society Certificate by attending 2 half-day tutorials OR 1 full-day tutorial out of the following 5 tutorials: Tutorial 5, Tutorial 6, Tutorial 8, Tutorial 10, and Tutorial 11.

  Still deciding which tutorial to attend? Check the audio recordings of the tutorial expose held on 23rd Sept, 2016.

Tutorial 1: Practical Software Availability Prediction in Telecommunication Industry

Presenter: Kazu Okumoto, Nokia  

This tutorial presents practical aspects of software availability prediction. Software content continues to evolve with additional code associated with new features and bug fixes throughout the software test phase. When a major change is made to the software, it generates a new wave of software defects. We will first review data requirements and then demonstrate that successive Non-Homogeneous Poisson Process (NHPP) exponential models can capture an entire defect trend precisely. The need for multiple curves is explained in terms of new code availability and test resource allocation such as testers, lab time and test cases. This approach helps confirm that software defect prediction should be performed based on test defect data from a final software product. We will then look beyond the defect prediction and address how to predict in-service software reliability and availability as we do for hardware reliability. Statistical confidence limits are also introduced to quantify the accuracy of reliability prediction. The proposed approach will be illustrated with actual data from several releases of various software development projects. We have successfully implemented this approach for software availability assessment of key telecommunication products over several years.
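As a rough illustration of the idea of successive exponential curves, the sketch below sums several exponential NHPP mean value functions of the form m(t) = a(1 - exp(-b t)), each starting when a new wave of code enters test. This is only one plausible reading of the abstract, not the presenter's actual method; the function names and all parameter values (start weeks, a, b) are hypothetical and chosen purely for illustration.

```python
import math


def exponential_nhpp_mean(t, a, b):
    """Expected cumulative defects found by time t for one exponential curve.

    a is the total number of defects the curve eventually exposes and
    b is the per-defect detection rate. Before the curve starts (t <= 0)
    nothing has been found.
    """
    if t <= 0:
        return 0.0
    return a * (1.0 - math.exp(-b * t))


def successive_mean(t, curves):
    """Expected cumulative defects at time t from several successive curves.

    Each curve is a (start, a, b) tuple; a curve contributes only after
    its own start time.
    """
    return sum(exponential_nhpp_mean(t - start, a, b) for start, a, b in curves)


def failure_intensity(t, curves):
    """Expected defect discovery rate at time t (derivative of the mean)."""
    return sum(
        a * b * math.exp(-b * (t - start))
        for start, a, b in curves
        if t >= start
    )


# Hypothetical parameters: an initial release in test from week 0 and a
# major code drop at week 20 that generates a second wave of defects.
CURVES = [(0.0, 100.0, 0.10), (20.0, 40.0, 0.15)]

if __name__ == "__main__":
    for week in (10, 20, 30, 60):
        found = successive_mean(week, CURVES)
        rate = failure_intensity(week, CURVES)
        print(f"week {week}: expected defects {found:.1f}, rate {rate:.2f}/week")
```

In practice the parameters would be fitted to observed test defect data (for example by maximum likelihood), and the residual intensity at the end of test would feed an in-service reliability estimate; both steps are outside this sketch.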

Tutorial 2: Assessing Dependability with Software Fault Injection 

Presenter: Roberto Natella and Domenico Cotroneo, Università degli Studi di Napoli Federico II

Software failures have become one of the major causes of accidents in critical systems. Unfortunately, this problem is exacerbated by the troubling trend towards greater software complexity and pervasiveness (which makes software issues unavoidable), and by the "fragility" of modern systems (which causes software issues to escalate into more serious accidents). Software Fault Injection increases awareness of how faulty software components can impact a large, complex system, by deliberately introducing faults in an experimental setting. This approach makes it possible to identify weak points and possible propagation channels; to support the acquisition and certification of off-the-shelf software components, in compliance with dependability requirements; and to tune and evaluate fault-tolerance algorithms.

This tutorial will provide a methodological introduction to Software Fault Injection, and a broad overview of Software Fault Injection techniques and tools. The tutorial will include motivating examples and relevant applications of Software Fault Injection for the assessment and benchmarking of software dependability.

Tutorial 3: Mission-Critical Software Assurance Engineering - Beyond Testing, Bug Finders, Metrics, Reliability Analysis, and Formal Verification

Presenter: Suresh Kothari, Iowa State University, and Jeremías Sauceda, EnSoft Corp.

Today, mission-critical software assurance engineering must encompass both safety and cyber-security. Critical missions, whether in defense, government, banking, or healthcare, depend on ensuring that a system meets safety requirements and does not fail under cyber attack. Mobile, cloud and Internet of Things (IoT) technologies have made software assurance integral to our everyday lives, whether it concerns a life-saving medical device, the integrity of the power grid, a soldier with a heads-up display, or a global supply chain. Cyberspace is so pervasive that the US Department of Defense has put cyberspace on par with land, sea, and air as a war-fighting domain, and White House reports point to the urgent national need to shift the cybersecurity posture from defending computer networks to assuring critical missions.

We will present case studies distilled from our research findings through participation in DARPA cybersecurity programs, work with automotive and avionics companies, verification of the Linux kernel, and studies of the NIST test suites Juliet and SARD. The goal will be to help the participants develop a good understanding of: (a) the core technical challenges for mission-critical software assurance, and (b) how and why techniques such as testing, bug finders, quality metrics, reliability analysis, and formal verification fall short in addressing these challenges. The tutorial will conclude with a discussion of two key ideas to address the challenges of mission-critical software assurance.

Tutorial 4: Software Robustness Testing of Complex Telecommunications Solutions

Presenter: Vincent Sinclair and Abhaya Asthana, Bell Labs

We can define software robustness as the degree to which a system or component can function correctly in the presence of invalid inputs or stressful environmental conditions. While software testing typically does a very good job of testing the functional requirements of a solution, there is less focus on testing for software robustness. As a result, software with significant robustness vulnerabilities escapes to the field, which can lead to service-affecting outages.

The tutorial will be driven by a number of case studies which are based on extensive experience with the development, testing and delivery of complex telecommunications software solutions. We will present examples of typical software robustness defects in a telecommunications network. Using these examples as input, we will explore how to build a comprehensive software robustness test strategy and test plan. First, we will identify how these defects typically enter the software, particularly when the end-to-end team is distributed across multiple sites and multiple time zones. We will then look in detail at how a development and test team can build a thorough software robustness test plan to prevent such defects from escaping. We will explore the critical and often overlooked need for input to the robustness test plan from the systems engineers, architects, designers and the customer-facing support team. An overview of fault modelling of a specific telecoms solution will be provided. Having identified the areas to focus on for software robustness testing, we will explore how to build test cases to uncover robustness defects in the software, including some appropriate software fault injection techniques. Finally, we will explore ways to extend typical stability testing to stress the software more aggressively and discover underlying robustness-type faults.

Tutorial 5: Modern Web Applications' Reliability Engineering

Presenter: Karthik Pattabiraman, Univ of British Columbia

JavaScript is today the de facto client-side programming language for modern web applications, where it is used extensively for interactivity and faster load times. For example, 97 of the top 100 Alexa websites use JavaScript code, often running into thousands of lines of code. However, JavaScript is notorious for its difficult-to-analyze constructs and "laissez-faire" programming style, which make it challenging to build reliable web applications in JavaScript. This tutorial will present approaches to assess and improve the reliability of modern JavaScript-based web applications.

In the first part of the tutorial, we will present empirical studies on the reliability of modern web applications, drawing on field data studies, bug databases and online fora such as Stack Overflow. We will then proceed to explore automated tools and techniques for testing, understanding and fault mitigation (i.e., repair) of web applications. Finally, we will conclude with some tools for building robust client-side web applications, and discuss open issues and research problems.

Tutorial 6: Testing Reliable Software in an Agile Context: Benefits, Challenges and Solutions

Presenter: Sigrid Eldh and Kristoffer Ankarberg, Ericsson

Many industries are using and adapting to Agile processes with continuous build and integration of software. Practices like test-driven development, refactoring and test automation are now more in focus than ever. Ericsson has a history of driving efficient processes and was an early adopter of both Agile and Lean concepts, which is no simple task for large complex systems with high demands on performance and reliability. This new way of working has made it possible to move into continuous deployment as DevOps comes more into focus. This tutorial will discuss hurdles, lessons learned, and the positive and negative consequences of such a shift. We will focus on Agile practices from a quality and test/verification angle, but with a twist: we will run the workshop as an “agile” project in sprints, and we will also systematically collect data during this tutorial from those who are willing. This means it will be an “active” participatory workshop, not only lecturing. Our goal in this tutorial is to share our experiences, but also to engage the audience in discussions, both online and by working on questions that the audience thinks are important in this context. Results collected will be shared at the workshop.

All the tutorial attendees are advised to bring their laptops to the tutorial session for the hands-on part of the tutorial. 

Tutorial 7: Designing Survivability for Big Data Software-as-a-Service Systems

Presenter: Hari Ramasamy, Long Wang and Rick Harper, IBM

The tutorial will be organized as a half-day activity. First, we will introduce terminology, theory, concepts, and metrics in designing survivability against large-scale outages for software-as-a-service systems specialized in dealing with big data workloads. Second, we will catalog typical challenges involved in designing survivability for such systems. Analytics workloads place special requirements on the underlying systems for high-speed processing of analytics operations, and we will discuss these special requirements. Third, we will present example solution designs for effectively meeting those requirements. Finally, we will guide the participants through a hands-on exercise of architecting a resiliency solution to meet a sample set of real-world requirements. The authors have first-hand experience in building and delivering such systems, so all the material will be grounded in the lessons learned from those experiences.

Tutorial 8: Data Analytics for Software Reliability

Presenter: Veena Mendiratta and Catello Di Martino, Bell Labs

The application of data analytics methods and techniques to data collected under real workload conditions provides valuable information about the system state and for predicting anomalous behavior. Textual/numeric data and log files produced by applications, operating systems, networks, and other monitoring sources play a key role in assessing system reliability and resiliency properties. Practitioners, academia, and industry strongly recognize the inherent value of log data and network metadata. Data-driven evaluation deepens our understanding of system dependability behavior, and enables stronger design and better monitoring strategies. The role of log files and data in measuring the dependability of production systems has long been recognized. Today, these studies are assuming particular relevance for failure analysis and prediction in industrial systems, networks, cloud and HPC systems; logs are the primary source of data available to gain insight into runtime issues in these systems. The understanding that can be gained from logs on today’s systems enables improved design and better monitoring and failure prediction strategies for future systems.

All tutorial attendees need to download the following software on their laptops for the hands-on part of the tutorial:

R  https://www.r-project.org/

RStudio  https://www.rstudio.com/

Install R and RStudio. After you open RStudio, install the following R packages: kohonen, dummies, ggplot2, sp, reshape2, RColorBrewer, shiny

Tutorial 9: Dependability Analysis in the Context of Component-Based System Architectures

Presenter: Mark Zeller and Kai Hoefig, Siemens

The importance of dependable software systems is continuously growing in many application domains of embedded systems, such as aerospace, railway, health care, automotive and industrial automation. Thus, along with growing system complexity, the need for dependability assessment, as well as the effort it requires, is increasing drastically in order to meet the high quality demands in these application domains. This half-day tutorial provides a deep understanding of why approaches that work fine for component-based modeling of systems are not applicable to the models that are currently used to analyze RAMS properties. The tutorial especially addresses attendees who work in the area of dependable systems, particularly in domains where increased software complexity has been observed in recent years, such as aerospace, railway, health care, automotive and industrial automation.

Tutorial 10: Agile Root Cause Analysis

Presenter: Ram Chillarege, Chillarege Inc.

The tutorial on Orthogonal Defect Classification (ODC) provides the practicing engineer and manager with a good overview of the technology, its benefits, practice and implementation. Attendees should have reasonable experience with the software development lifecycle, process improvement methods, tools, and practices, and an appreciation of Agile development methods. Knowledge of historical software development processes and principles is useful, but not necessary. ODC is a technology that extracts semantics from the software defect stream to provide insight into the development process and product. This tutorial covers:

  • ODC Concepts

  • ODC Classification and Information Extraction

  • How to gain 10x in Root Cause Analysis

  • How to tune up the Test Process using ODC

  • In-process Measurement and Prediction with ODC

  • Case Studies of ODC based Process Diagnosis

  • What is required to support ODC?

  • How does one plan an ODC Rollout?

Tutorial 11: Use of Data Science and Measurement in Software Reliability Engineering

Presenter: Sunita Chulani, Cisco

There is an abundance of metrics and models in Software Reliability Engineering, but a significant challenge in the usage and deployment of these models is to develop mathematically sound models that reflect the real world and can help engineering teams improve their processes and produce great products.

In this tutorial, I will detail the experiences that I have had with building and implementing models that are practically deployed within engineering teams using different development best practices that are effective for waterfall, hybrid and agile processes. I will show how we link in-process measures (development/test) to customer experience and customer satisfaction, which in turn correlate to revenue. We will discuss the decision process involved in choosing the most valuable metrics, setting goals on those metrics, and controlling our processes using extremely practical approaches exhibiting the use of highly scientific models in practical industry projects. Both scientific and practical aspects need to unite to have an impact on our engineering environment. Most organizations have an abundance of data that can be harvested using practical data science techniques into industrial strength models.


Tutorial Expose Audio Recordings

Check the audio recordings of the tutorial expose held on 23rd September, 2016 through Webex (in this order):

  1. Introduction to Tutorials
  2. Tutorial 5 Karthik Pattabiraman
  3. Tutorial 7 Hari Ramasamy, Long Wang and Rick Harper, IBM
  4. Tutorial 8 Veena Mendiratta and Catello Di Martino, Bell Labs
  5. Tutorial 11 Sunita Chulani
  6. Open Questions