Article

Discovery of Resource-Oriented Transition Systems for Yield Enhancement in Semiconductor Manufacturing


Abstract

In semiconductor manufacturing, data-driven methodologies have enabled the resolution of various issues, particularly in yield management and enhancement. Yield, one of the most important key performance indicators in semiconductor manufacturing, is mostly affected by production resources, i.e., the equipment involved in the process. A substantial body of research has focused on finding correlations between yield and the status of resources. In general, however, multiple resources are engaged in production processes, which may cause multicollinearity among resources. It is therefore important to discover resource paths that are positively or negatively associated with yield. This paper proposes a systematic methodology for discovering a resource-oriented transition system model of a semiconductor manufacturing process to identify resource paths resulting in high and low yield. The proposed method is based on model-based analysis (i.e., finite state machine mining) in process mining together with statistical analyses. We conducted an empirical study with real-life data from one of the leading semiconductor manufacturing companies to validate the proposed approach.
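The paper itself is not accompanied by code; the following minimal Python sketch only illustrates the general idea of mining a resource-oriented transition system from an event log, where each state abstracts the sequence of resources (equipment) a wafer lot has visited and is annotated with the yields observed there. The data layout and names such as `event_log` and `build_transition_system` are illustrative assumptions, not the authors' implementation.

```python
from collections import defaultdict

# Hypothetical event log: for each case (wafer lot), the sequence of resources
# used across the process steps and the final yield. The schema is an assumption.
event_log = {
    "lot_1": (["EQ_A", "EQ_C", "EQ_F"], 0.93),
    "lot_2": (["EQ_B", "EQ_C", "EQ_F"], 0.71),
    "lot_3": (["EQ_A", "EQ_D", "EQ_F"], 0.95),
}

def build_transition_system(log):
    """Mine a resource-oriented transition system.

    States are prefixes of resource sequences (a 'sequence' abstraction);
    transitions are labelled with the resource that extends the prefix.
    Each state collects the yields of the cases that pass through it.
    """
    states = defaultdict(list)        # state -> observed yields
    transitions = defaultdict(int)    # (state, resource, next_state) -> frequency
    for resources, yield_value in log.values():
        prefix = ()
        states[prefix].append(yield_value)
        for resource in resources:
            next_prefix = prefix + (resource,)
            transitions[(prefix, resource, next_prefix)] += 1
            states[next_prefix].append(yield_value)
            prefix = next_prefix
    return states, transitions

states, transitions = build_transition_system(event_log)
for state, yields in states.items():
    print(state, "mean yield:", round(sum(yields) / len(yields), 3))
```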


... Incomplete events usually refer to events with missing values or missing attributes. Incomplete events include missing case ids [50], missing timestamps [23,32,63], missing activities [20,23], and missing other attribute values relevant to the analysis [31]. ...
... The incompleteness of a case is usually described as a case that is not completed or does not represent the end-to-end process. This means that such cases lack some events, for example, "remove any record that may create only one event per case as it will not depict the sequence of activities and hinder the performance analysis of the model" [67] and "removing cases that did not cover the whole steps" [20]. ...
... Filtering redundant data: only two papers mentioned redundant data [19,20]. In [20], redundant events were included under data errors: "we conducted some data preprocessing, including handling data error (e.g., removing redundant events and eliminating multiple yield values)", whereas [19] provided no further definition or explanation. ...
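The preprocessing steps quoted above (removing redundant events and filtering cases that do not cover the whole process) are commonly expressed over a tabular event log. The pandas sketch below is one possible reading; the column names `case_id`, `activity`, and `timestamp` and the start/end convention are assumptions, not taken from the reviewed papers.

```python
import pandas as pd

# Assumed event-log layout: one row per event.
log = pd.DataFrame({
    "case_id":   ["c1", "c1", "c1", "c2", "c2", "c2", "c3"],
    "activity":  ["start", "etch", "end", "start", "etch", "etch", "etch"],
    "timestamp": pd.to_datetime([
        "2021-01-01 08:00", "2021-01-01 09:00", "2021-01-01 10:00",
        "2021-01-02 08:00", "2021-01-02 09:00", "2021-01-02 09:00",
        "2021-01-03 08:00",
    ]),
})

# 1) Remove redundant (exactly duplicated) events.
log = log.drop_duplicates(subset=["case_id", "activity", "timestamp"])

# 2) Keep only cases that cover the end-to-end process,
#    approximated here as "contains both the start and the end activity".
activities_per_case = log.groupby("case_id")["activity"].agg(set)
complete_cases = activities_per_case[
    activities_per_case.apply(lambda acts: {"start", "end"} <= acts)].index
log = log[log["case_id"].isin(complete_cases)]

# 3) Drop cases that would produce only a single event.
counts = log["case_id"].value_counts()
log = log[log["case_id"].isin(counts[counts > 1].index)]

print(log)
```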
... In suspicious machine selection in SP-MMP, both the individual effect and the joint effect should be considered [8,16,17]. The individual effect of a machine is the quality loss of a product due to that machine's fault. ...
... This aspect is discussed in Section 4.1. [20] binary regression tree; [21] association rule mining; [17]b retrospective design of experiment; [16]c transition system model; [22] frequent sequence pattern mining. Nominal feature, machine sequence patterns: [23] association rule mining; [8] sequence pattern mining (Longest Common Subsequence, TEIRESIAS) with a decision tree classifier; [24] sequence pattern mining (Iterated Function System); [25] association rule mining; [26] association rule mining; [14] frequent sequence pattern mining; [27] LASSO; [7] association rule mining. b: improved a. c: suggested both selecting suspicious machines and MSPs. ...
... Then, the result is presented to the engineers. [16] employed a similar framework; however, it conducted a normality test on the yields of wafers. When the test is passed, ANOVA is employed to select critical process stages and a linear contrast test to select suspicious machines. ...
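As a rough illustration of the framework described in this excerpt (a normality check on wafer yields followed by ANOVA across the alternative machines of a stage), the scipy-based sketch below is a simplified stand-in. The final step, which flags the machine whose mean deviates most from the stage mean, is only an approximation of the linear contrast test mentioned above, and the data are invented.

```python
import numpy as np
from scipy import stats

# Hypothetical yields of wafers processed by the alternative machines of one stage.
yields_by_machine = {
    "M1": np.array([0.92, 0.94, 0.91, 0.93, 0.95]),
    "M2": np.array([0.90, 0.93, 0.92, 0.94, 0.91]),
    "M3": np.array([0.81, 0.84, 0.80, 0.83, 0.82]),  # possibly faulty
}

# 1) Normality check per machine (Shapiro-Wilk).
for machine, y in yields_by_machine.items():
    stat, p = stats.shapiro(y)
    print(f"{machine}: Shapiro-Wilk p = {p:.3f}")

# 2) One-way ANOVA across the machines of the stage.
f_stat, p_anova = stats.f_oneway(*yields_by_machine.values())
print(f"ANOVA p = {p_anova:.4f}")

# 3) If the stage is significant, flag the machine whose mean yield deviates
#    most from the overall mean (a crude stand-in for a linear contrast test).
if p_anova < 0.05:
    overall = np.concatenate(list(yields_by_machine.values())).mean()
    suspicious = max(yields_by_machine,
                     key=lambda m: abs(yields_by_machine[m].mean() - overall))
    print("Suspicious machine:", suspicious)
```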
Conference Paper
Full-text available
A serial-parallel multistage manufacturing process (SP-MMP) consists of multiple consecutive process stages, each of which has several alternative machines. Although the machines in a stage perform an identical function, their actual conditions are not the same. A faulty machine has a direct and negative impact on the quality of products. Diagnosing the machines in an SP-MMP is required to detect faulty machines, but diagnosis is a cost- and labor-intensive task. Thus, suspicious machines, i.e., those suspected of being faulty, are first selected and then diagnosed in the production line. To select suspicious machines, various studies have employed production log data, which record the sequence of operating machines throughout the entire process for each product. This study provides a literature review of suspicious machine selection methods using production log data, with a focus on the semiconductor industry. The reviewed articles are classified into three groups along two dimensions, namely the type of quality feature and the relationship analysis results. Based on the review, the status of current research is summarized, and limitations and future research directions are suggested.
... The addition of attributes to the log allows additional perspectives to be mined. For instance, the yield of a particular process (e.g., semiconductor manufacturing) can be recorded in the log and used to derive high-yield paths in the production flow (Cho et al., 2021). Dörgo et al. (2018) use event logs from a coke refinery plant to retrieve the set of actions that operators perform frequently in similar situations. ...
Article
Full-text available
Recently, production plants have become very complex environments in which the final output is the result of the favourable interplay of several processes. Successful production management strictly depends on the ability to grasp the current disposition of both physical and managerial processes. To achieve this goal, the use of structured, data-based methods has proved to be very effective. Yet, the literature lacks successful applications, especially regarding production support processes (e.g., order acquisition, procure-to-pay), which are directly connected to the overall system performance. This work proposes an approach for enabling automated mapping and control of production support processes starting from available datasets. The limitations of existing methodologies are addressed by exploiting the combined application of two process mining algorithms: the heuristics miner and the inductive miner. The managerial implications are described within the application to a real case study, in which the main phases of the procure-to-pay process of a manufacturing company are identified and analysed automatically. The proposed approach proves its effectiveness in the context of application. The numerical results demonstrate that process mining can effectively identify improvements not only to physical processes but also to information flows and support processes that are crucial for guaranteeing the prosperity of an enterprise.
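The combined use of the heuristics and inductive miners described here can be reproduced in outline with the open-source pm4py library. The sketch below assumes an XES log file named procure_to_pay.xes and pm4py's simplified interface; treat the exact calls and thresholds as an approximation rather than the authors' tooling.

```python
import pm4py

# Load the event log of the procure-to-pay process (file name is an assumption).
log = pm4py.read_xes("procure_to_pay.xes")

# Inductive miner: guarantees a sound, block-structured Petri net.
net, initial_marking, final_marking = pm4py.discover_petri_net_inductive(log)

# Heuristics miner: robust to noise, exposes dependency frequencies.
heuristics_net = pm4py.discover_heuristics_net(log, dependency_threshold=0.5)

# Visual inspection of both views of the process.
pm4py.view_petri_net(net, initial_marking, final_marking)
pm4py.view_heuristics_net(heuristics_net)
```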
... Further, the addition of attributes to the log allows additional perspectives to be mined. For instance, the yield of a particular process (e.g., semiconductor manufacturing) can be recorded in the log and used to derive high-yield paths in the production flow [105]. Another attribute of interest is cost. ...
Thesis
Full-text available
The latest developments in industry have involved the deployment of digital twins for both long- and short-term decision-making, such as supply chain management and production planning and control. The ability to take appropriate decisions online is strongly based on the assumption that digital models are properly aligned with the real system at any time. As modern production environments are frequently subject to disruptions and modifications, the development of digital twins of manufacturing systems cannot rely solely on manual efforts. Industry 4.0 has contributed to the rise of new technologies for data acquisition, storage and communication, allowing the shop-floor status to be known at any time. If a model could be generated from the available data in a manufacturing system, the development phase might be significantly shortened. However, practical implementations of automated model generation approaches remain scarce. It is also true that automatically built representations may be excessively accurate and describe activities that are not significant for estimating the system performance. Hence, the generation of models with an appropriate level of detail can avoid useless efforts and long computation times, while allowing for easier understanding and re-usability. This research focuses on the development and adoption of automated model generation techniques for obtaining simulation-based digital models starting from the data logs of manufacturing systems, together with methods to adjust the models toward a desired level of detail. The properties and parameters of the manufacturing system, such as buffer sizes, are estimated from data through inference algorithms. The system properties are also used in a model tuning approach, which generates an adjusted model starting from the available knowledge and the user requirements in terms of complexity (e.g., number of stations). In addition, a lab-scale environment has been built with the aim of testing decision-making frameworks based on digital twins within a realistic data infrastructure. The experimental results prove the effectiveness of the proposed methodology in generating proper digital models that can correctly estimate the performance of a manufacturing system. The model generation and tuning method can positively contribute to real-time simulation. Indeed, its application within an online framework of production planning and control allows for adapting simulation models to the real system, potentially at any time a modification occurs. This way, decisions taken online are guaranteed to refer to the current state of the factory. Thanks to this research, manufacturing enterprises will be able to reach a higher production flexibility, together with higher responsiveness to technological changes and market-demand fluctuations.
... In many domains, such as manufacturing, healthcare, and services, the performance of processes is often highly related to the resources, i.e., the process participants [6]. For instance, in manufacturing, the machines involved in production have a strong influence on the quality of the final products [7]. Moreover, in healthcare, each medical staff member has a different level of proficiency, resulting in differences in patient satisfaction [8]. ...
Article
Full-text available
Context-aware process mining aims at extending the contemporary approach with process contexts for realistic process modeling. Within this discipline, there have been several attempts to combine process discovery and predictive process modeling with context information, e.g., time and cost. The focus of this paper is to develop a new method for deriving a quality-aware resource model. It first generates a resource-oriented transition system and identifies the quality-based superior and inferior cases. The quality-aware resource model is constructed by integrating these two results, and we also propose a model simplification method based on statistical analyses for better resource model visualization. This paper includes tooling support for our method, and a case study on a semiconductor manufacturing process is presented to validate the usefulness of the proposed approach. We expect our work to be practically applicable to a range of fields, including manufacturing and healthcare systems.
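Building on the transition-system sketch shown earlier, the fragment below illustrates one way to surface quality-based superior and inferior behaviour: for each resource-to-resource transition, the yields of the cases traversing it are compared with those of the remaining cases using a Mann-Whitney U test. The data layout, the test, and the threshold-free comparison are illustrative assumptions, not the authors' exact simplification procedure.

```python
from collections import defaultdict
from scipy import stats

# Hypothetical cases: (sequence of resources visited, final yield).
cases = [
    (["EQ_A", "EQ_C"], 0.95), (["EQ_A", "EQ_C"], 0.93),
    (["EQ_B", "EQ_C"], 0.72), (["EQ_B", "EQ_C"], 0.70),
    (["EQ_A", "EQ_D"], 0.94), (["EQ_B", "EQ_D"], 0.75),
]

# Yields of cases that traverse each resource transition.
edge_yields = defaultdict(list)
for resources, y in cases:
    for edge in zip(resources, resources[1:]):
        edge_yields[edge].append(y)

# Flag transitions whose yields differ significantly from the rest
# (candidates for the 'superior'/'inferior' parts of the resource model).
for edge, ys in edge_yields.items():
    others = [y for r, y in cases if edge not in zip(r, r[1:])]
    if ys and others:
        stat, p = stats.mannwhitneyu(ys, others, alternative="two-sided")
        print(edge, "p =", round(p, 3))
```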
Chapter
The stages of the semiconductor production chain range from design and production to field monitoring and customer service. Failure analysis of products on the market is essential to provide feedback to production and increase process quality. This paper presents an analytical methodology based on customer semiconductor failure data, which integrates a specific system to manage these failures (the Sigequalis) and complements efforts to control the manufacturing process, extending analytical capabilities to the final stages of the production chain. It includes an approach for analyzing reliable offenders and a key indicator whose monitoring reinforces the use of the 8D methodology for corrective actions on nonconformities. Such analyses allow for identifying batches of devices with problems, directing actions that may anticipate the occurrence of problems for customers, and providing feedback to the manufacturing process. The methodology and its corresponding tool combine and expand the information collected from customers, enabling a big picture of possible offending elements in order to act correctively and predictively, improving the quality of semiconductor production. Keywords: Quality analysis, Data intelligence, Semiconductor manufacturing
Article
Full-text available
The stream of variation (SoV) model is an effective tool for describing dimensional variation and its propagation in multistage machining processes. Compared with traditional single-stage error models, which consider errors from a single machining stage only, the SoV model can depict the complicated interactions between different errors at different stages. This paper reviews three major linearized SoV modeling methods for multistage machining processes based on the literature published over the last two decades. These three linearized SoV modeling methods are based on differential motion vectors, equivalent fixture error, and kinematic analysis, respectively. Each method has its corresponding advantages and disadvantages. The model using the differential motion vector (DMV) concept from robotics incorporates fixture-, datum-, and machining-induced variations in the multistage variation propagation for an orthogonal 3-2-1 fixturing layout, while the primary datum deviation is currently overlooked. The kinematic analysis method can address general fixture layouts rather than being limited to orthogonal 3-2-1 fixture layouts. The variation propagation model using the equivalent fixture error concept can directly model the process physics regarding how fixture, datum, and machine tool errors generate the same pattern on the features of the workpiece. The results of the three models on an example are also compared. Finally, a perspective on future research directions for SoV methods is presented.
Article
Full-text available
Today, in an Industry 4.0 factory, machines are connected as a collaborative community. Such evolution requires the utilization of advanced prediction tools, so that data can be systematically processed into information to explain uncertainties and thereby make more "informed" decisions. Cyber-physical system-based manufacturing and service innovations are two inevitable trends and challenges for manufacturing industries. This paper addresses the trends of manufacturing service transformation in a big data environment, as well as the readiness of smart predictive informatics tools to manage big data, thereby achieving transparency and productivity.
Article
Full-text available
This paper investigates learning causal relationships from the extensive datasets that are becoming increasingly available in manufacturing systems. A causal modeling approach is proposed to improve an existing causal discovery algorithm by integrating manufacturing domain knowledge with the algorithm. The approach is demonstrated by discovering the causal relationships among the product quality and process variables in a rolling process. When allied with engineering interpretations, the results can be used to facilitate rolling process control.
Article
Full-text available
Statistical errors are common in scientific literature and about 50% of the published articles have at least one error. The assumption of normality needs to be checked for many statistical procedures, namely parametric tests, because their validity depends on it. The aim of this commentary is to overview checking for normality in statistical analysis using SPSS.
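The commentary is written around SPSS; a roughly equivalent check in Python (an assumption on my part, not part of the article) combines the Shapiro-Wilk and D'Agostino-Pearson tests from scipy with the sample's skewness and kurtosis.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.normal(loc=10.0, scale=2.0, size=60)   # replace with the data under study

# Shapiro-Wilk test (well suited to small and moderate samples).
w_stat, p_shapiro = stats.shapiro(sample)

# D'Agostino-Pearson omnibus test based on skewness and kurtosis.
k2_stat, p_dagostino = stats.normaltest(sample)

print(f"Shapiro-Wilk p = {p_shapiro:.3f}, D'Agostino p = {p_dagostino:.3f}")
print("Skewness:", stats.skew(sample), "Excess kurtosis:", stats.kurtosis(sample))
```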
Article
Full-text available
In this paper, a state space modeling approach is developed for the dimensional control of sheet metal assembly processes. In this study, a 3-2-1 scheme is assumed for the sheet metal assembly. Several key concepts, such as tooling locating error, part accumulative error, and re-orientation error, are defined. The inherent relationships among these error components are developed. Those relationships finally lead to a state space model which describes the variation propagation throughout the assembly process. An observation equation is also developed to represent the relationship between the observation vector (the in-line OCMM measurement information) and the state vector (the part accumulative error). Potential usage of the developed model is discussed in the paper.
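A generic linear state-space form of this kind of variation-propagation model, written here in standard notation as an illustration (the paper's own symbols and error decomposition may differ), is:

```latex
\begin{aligned}
\mathbf{x}_k &= \mathbf{A}_{k-1}\,\mathbf{x}_{k-1} + \mathbf{B}_k\,\mathbf{u}_k + \mathbf{w}_k, \\
\mathbf{y}_k &= \mathbf{C}_k\,\mathbf{x}_k + \mathbf{v}_k,
\end{aligned}
```

where x_k is the accumulated part deviation after station k, u_k collects the tooling and locating errors introduced at station k, y_k is the in-line measurement (e.g., OCMM readings), and w_k, v_k are noise terms.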
Article
Full-text available
Within the complex and competitive semiconductor manufacturing industry, lot cycle time (CT) remains one of the key performance indicators. Its reduction is of strategic importance as it contributes to cost reduction, time-to-market shortening, faster fault detection, achieving throughput targets, and improving production-resource scheduling. To reduce CT, we suggest and investigate a data-driven approach that identifies key factors and predicts their impact on CT. In our novel approach, we first identify the most influential factors using conditional mutual information maximization, and then apply the selective naive Bayesian classifier (SNBC) for further selection of a minimal, most discriminative key-factor set for CT prediction. Applied to a data set representing a simulated fab, our SNBC-based approach improves the accuracy of CT prediction by nearly 40% while narrowing the list of factors from 182 to 20. It shows accuracy comparable to that of other machine learning and statistical models, such as a decision tree, a neural network, and multinomial logistic regression. Compared to them, our approach also demonstrates simplicity and interpretability, as well as speedy and efficient model training. This approach could be implemented relatively easily in the fab, promoting new insights into the process of wafer fabrication.
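The feature-selection-plus-naive-Bayes idea can be sketched with scikit-learn. Note that sklearn's `mutual_info_classif` scores features marginally rather than by conditional mutual information maximization, and the selective naive Bayes step is reduced here to a plain Gaussian naive Bayes on the top-ranked factors, so this is only an approximation of the approach described above, run on synthetic data.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 182))               # 182 candidate factors, as in the study
# Binned cycle-time class (synthetic stand-in for the fab data).
ct_class = (X[:, 3] + 0.5 * X[:, 17] + rng.normal(scale=0.5, size=500) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, ct_class, random_state=0)

# Rank factors by (marginal) mutual information with the CT class and keep the top 20.
mi = mutual_info_classif(X_train, y_train, random_state=0)
top = np.argsort(mi)[::-1][:20]

# Naive Bayes classifier on the selected key factors.
model = GaussianNB().fit(X_train[:, top], y_train)
print("Selected factors:", sorted(top.tolist()))
print("Test accuracy:", model.score(X_test[:, top], y_test))
```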
Article
Full-text available
Process mining allows for the automated discovery of process models from event logs. These models provide insights and enable various types of model-based analysis. This paper demonstrates that the discovered process models can be extended with information to predict the completion time of running instances. There are many scenarios where it is useful to have reliable time predictions. For example, when a customer phones her insurance company for information about her insurance claim, she can be given an estimate for the remaining processing time. In order to do this, we provide a configurable approach to construct a process model, augment this model with time information learned from earlier instances, and use this to predict e.g. the completion time. To provide meaningful time predictions we use a configurable set of abstractions that allow for a good balance between "overfitting" and "underfitting". The approach has been implemented in ProM and through several experiments using real-life event logs we demonstrate its applicability.
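In outline, the prediction works by annotating the states of an abstracted transition system with the remaining times observed in historical cases. The toy sketch below uses a "last activity" abstraction and plain averages, which is a strong simplification of the configurable abstractions and the ProM implementation described in the abstract.

```python
from collections import defaultdict

# Historical cases: lists of (activity, hours elapsed since case start).
history = [
    [("register", 0), ("assess", 4), ("decide", 10)],
    [("register", 0), ("assess", 6), ("decide", 16)],
    [("register", 0), ("decide", 8)],
]

# Annotate each state (here: the last executed activity) with observed remaining times.
remaining = defaultdict(list)
for trace in history:
    end_time = trace[-1][1]
    for activity, t in trace:
        remaining[activity].append(end_time - t)

def predict_remaining(last_activity):
    """Predict remaining time as the mean remaining time seen in this state."""
    times = remaining[last_activity]
    return sum(times) / len(times) if times else None

# A running case whose last observed event was "assess".
print("Predicted remaining time (h):", predict_remaining("assess"))
```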
Article
Full-text available
Today, machining systems are complex multistation manufacturing systems that involve a large number of machining operations and several locating datum changes. Dimensional errors introduced at each machining operation are transformed and cause new errors to occur as the workpiece propagates through the machining system. The appropriate choice of measurements in such a complex system is crucial for the subsequent successful identification of the root causes of machining errors hidden in dimensional measurements of the workpiece. In order to facilitate this measurement selection process, methods for quantitative characterization of measurement schemes must be developed. This problem of quantitative measurement characterization, referred to as the measurement scheme analysis problem, is dealt with in this paper. The measurement scheme analysis is accomplished through characterization of the maximal achievable accuracy of estimation of process-level parameters based on the measurements in a given measurement scheme. The stream of variation methodology is employed to establish a connection between the process-level parameters and measured product quality. Both Bayesian and non-Bayesian assumptions in the estimation are considered and several analytical properties are derived. The properties of the newly derived measurement scheme analysis methods are demonstrated in measurement scheme characterization for a multistation machining system used for machining of an automotive cylinder head.
Article
Full-text available
The characteristics of geometrical process control (GPC) are discussed. GPC combines the three key plant applications of process control, production control and alarm management. The mathematical basis of GPC is the use of n-dimensional geometry together with Inselberg's coordinate transformation, which makes it possible to see a multi-variable graph containing variables and different observations in a single picture. The main advantage of GPC is that the methods are simple to apply and do not require advanced mathematics, algebra, or equations.
Article
Full-text available
In this paper, a state space model is developed to describe the dimensional variation propagation of multistage machining processes. A complicated machining system usually contains multiple stages. When the workpiece passes through multiple stages, machining errors at each stage will be accumulated and transformed onto the workpiece. Differential motion vector, a concept from the robotics field, is used in this model as the state vector to represent the geometric deviation of the workpiece. The deviation accumulation and transformation are quantitatively described by the state transition in the state space model. A systematic procedure that builds the model is presented and an experimental validation is also conducted. The validation result is satisfactory. This model has great potential to be applied to fault diagnosis and process design evaluation for complicated machining processes.
Article
Full-text available
Empirical models based on real-time equipment signals are used to predict the outcome (e.g., etch rates and uniformity) of each wafer during and after plasma processing. Three regression and one neural network modeling methods were investigated. The models are verified on data collected several weeks after the initial experiment, demonstrating that the models built with real-time data survive small changes in the machine due to normal operation and maintenance. The predictive capability can be used to assess the quality of the wafers after processing, thereby ensuring that only wafers worth processing continue down the fabrication line. Future applications include real-time evaluation of wafer features and economical run-to-run control
Article
Big data analytics has been employed to extract useful information and derive effective manufacturing intelligence for yield management in semiconductor manufacturing, which is one of the most complex manufacturing processes due to tightly constrained production processes, reentrant process flows, sophisticated equipment, volatile demands, and complicated product mixes. Indeed, the increasing adoption of multimode sensors, intelligent equipment, and robotics has enabled the Internet of Things (IoT) and big data analytics for semiconductor manufacturing. Although the processing tool, chamber set, and recipe are selected according to product design and previous experience, domain knowledge has become less efficient for defect diagnosis and fault detection. To fill the gap, this study aims to develop a framework based on Bayesian inference and Gibbs sampling to investigate intricate semiconductor manufacturing data for fault detection and thus empower intelligent manufacturing. In addition, Cohen's kappa coefficient was used to eliminate the influence of extraneous variables. The proposed approach was validated through an empirical study and simulation. The results show the practical viability of the proposed approach.
Article
Manufacturing has evolved to become more automated, computerised and complex. In this paper, the origin, current status and future developments of manufacturing are discussed. Smart manufacturing is an emerging form of production integrating manufacturing assets of today and tomorrow with sensors, computing platforms, communication technology, control, simulation, data-intensive modelling and predictive engineering. It utilises the concepts of cyber-physical systems spearheaded by the internet of things, cloud computing, service-oriented computing, artificial intelligence and data science. Once implemented, these concepts and technologies would make smart manufacturing the hallmark of the next industrial revolution. The essence of smart manufacturing is captured in six pillars: manufacturing technology and processes, materials, data, predictive engineering, sustainability, and resource sharing and networking. Material handling and supply chains have been an integral part of manufacturing. The anticipated developments in material handling and transportation, and their integration with manufacturing driven by sustainability, shared services and service quality, are outlined. The future trends in smart manufacturing are captured in ten conjectures ranging from manufacturing digitisation and the material-product-process phenomenon to enterprise dichotomy and standardisation.
Article
In semiconductor manufacturing, the multilayer overlay lithography process is a typical multistage manufacturing process; one of the key factors that restrict the reliability and yield of integrated circuit chips is overlay error between the layers. To effectively control overlay error, an accurate error model that can present the introduction, accumulation and propagation of multilayer overlay error is indispensable. On the basis of the existing original physical model, a model for multilayer overlay error based on the state space modeling method is proposed in this study. The model can provide information on the wafer coordinates and field coordinates by a coefficient matrix. The model was applied to 810 groups of real data collected from a wafer manufacturing plant for empirical validation. The test results demonstrate that the overlay error and coordinates that are predicted by the state space model work well in tracking the variation in the actual measured values; the rates of target hitting and the R-square values of different sampling variables are very close to one.
Article
This paper describes the development and practical use of fab-wide fault detection and classification (FDC) for advanced semiconductor manufacturing using big data. Fab-wide FDC requires the collection of equipment big data for FDC judgment; hence, we developed an equipment monitoring system that handles the data at high speed and in real time. We succeeded in stopping equipment and lots automatically when the equipment was detected to be in a fault condition. In addition, we developed an environment that enables immediate data collection for analysis through data aggregation and merging functions, which extract keys correlating with yield from the equipment parameters. Furthermore, we succeeded in developing a high-speed, high-accuracy process control system that implements virtual metrology and run-to-run functions with the purpose of reducing process variation.
Article
To maintain competitive advantages, the semiconductor industry has strived for continuous technology migration and quick response to yield excursions. As wafer fabrication has become increasingly complicated in nanoscale technologies, many factors, including recipe, process, tool, and chamber, affect the yield with multicollinearity that is hard to detect and interpret. Although design of experiments (DOE) is a cost-effective approach to consider multiple factors simultaneously, it is difficult to follow the design to conduct experiments in real settings. Alternatively, data mining has been widely applied to extract potentially useful patterns for manufacturing intelligence. However, because hundreds of factors must be considered simultaneously to accurately characterize the yield performance of newly released technologies and tools for diagnosis, data mining requires tremendous time for analysis and often generates too many patterns that are hard for domain experts to interpret. To address these needs in real settings, this study aims to develop a retrospective DOE data mining approach that matches potential designs with the huge amount of data automatically collected in semiconductor manufacturing to enable effective and meaningful knowledge extraction from the data. DOE can detect high-order interactions and show how interconnected factors respond over a wide range of values. To validate the proposed approach, an empirical study was conducted in a semiconductor manufacturing company in Taiwan, and the results demonstrated its practical viability.
Article
Process monitoring and profile analysis are crucial in detecting various abnormal events in semiconductor manufacturing, which consists of highly complex, interrelated, and lengthy wafer fabrication processes for yield enhancement and quality control. To address real requirements, this study aims to develop a framework for semiconductor fault detection and classification (FDC) to monitor and analyze wafer fabrication profile data from a large number of correlated process variables to eliminate the cause of the faults and thus reduce abnormal yield loss. Multi-way principal component analysis and data mining are used to construct the model to detect faults and to derive the rules for fault classification. An empirical study was conducted in a leading semiconductor company in Taiwan to validate the model. Use of the proposed framework can effectively detect abnormal wafers based on a controlled limit and the derived simple rules. The extracted information can be used to aid fault diagnosis and process recovery. The proposed solution has been implemented in the semiconductor company. This has simplified the monitoring process in the FDC system through the fewer key variables. The results demonstrate the practical viability of the proposed approach.
Article
A survey revealed that researchers still seem to encounter difficulties to cope with outliers. Detecting outliers by determining an interval spanning over the mean plus/minus three standard deviations remains a common practice. However, since both the mean and the standard deviation are particularly sensitive to outliers, this method is problematic. We highlight the disadvantages of this method and present the median absolute deviation, an alternative and more robust measure of dispersion that is easy to implement. We also explain the procedures for calculating this indicator in SPSS and R software.
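The median absolute deviation rule advocated in the article can be written in a few lines of Python; the 1.4826 consistency constant and the threshold of 3 follow the usual recommendation, while the article itself gives SPSS and R procedures.

```python
import numpy as np

def mad_outliers(x, threshold=3.0):
    """Flag outliers using the median absolute deviation (MAD).

    The constant 1.4826 makes the MAD a consistent estimator of the
    standard deviation under normality.
    """
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    mad = 1.4826 * np.median(np.abs(x - med))
    return np.abs(x - med) > threshold * mad

data = [9.8, 10.1, 10.0, 9.9, 10.2, 25.0]   # 25.0 is an obvious outlier
print(mad_outliers(data))                    # -> [False False False False False  True]
```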
Article
In today's competitive semiconductor manufacturing environment, improving fab productivity and reducing cost require that all systems work collaboratively towards production, quality and cost targets. Equipment engineering systems (EES), including advanced process control, have risen to the top as key enablers for maximizing fab productivity; however, these systems have been hindered by their focus on equipment- and process-level, rather than fab-wide, metrics. Extending EES into the yield management space allows yield prediction information to be leveraged as feedback for closing the loop around the fab to better achieve productivity goals. With this added capability of yield prediction, EES can: 1) predict yield excursions as well as excursion sources (leveraging virtual metrology technology), thereby avoiding costly post-mortem yield recovery activities; 2) improve EES capabilities such as maintenance management and scheduling/dispatch; and 3) utilize yield prediction information as feedback to all levels of control in the fab, from individual processes up through factory-level controllers, so that processes can be continuously tuned to meet yield and device performance targets without resorting to design changes to slower products. Case studies applying this feedback information illustrate that benefits can be achieved from straightforward applications; however, these benefits can be expanded, especially as more complex scheduling and control solutions are implemented. Realizing this type of yield-enhanced EES solution (YMeAPC) in a cost-effective manner requires fab-wide adherence to standards and best practices for component integration, event-based control system operation, and user interface management.
Article
During wafer fabrication, process data, equipment data, and lot history are automatically or semi-automatically recorded and accumulated in databases for monitoring the process, diagnosing faults, and managing manufacturing. However, in high-tech industries such as semiconductor manufacturing, many interrelated factors affect the yield of fabricated wafers. Engineers who rely only on personal domain knowledge cannot find possible root causes of defects rapidly and effectively. This study aims to develop a framework for data mining and knowledge discovery from databases that consists of a Kruskal–Wallis test, K-means clustering, and the variance reduction splitting criterion to investigate the huge amount of semiconductor manufacturing data and infer possible causes of faults and manufacturing process variations. The extracted information and knowledge are helpful to engineers as a basis for troubleshooting and defect diagnosis. We validated this approach with an empirical study in a semiconductor foundry company in Taiwan, and the results demonstrated its practical viability.
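A stripped-down version of two ingredients named in this framework (a Kruskal-Wallis test relating equipment to yield, followed by K-means clustering of the lots) can be assembled from scipy and scikit-learn. The data layout and parameters below are invented for illustration and do not reproduce the study's variance-reduction splitting criterion.

```python
import numpy as np
from scipy import stats
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)

# Hypothetical lot yields grouped by the equipment used at one critical step.
yield_by_tool = {
    "TOOL_A": rng.normal(0.93, 0.02, 40),
    "TOOL_B": rng.normal(0.92, 0.02, 40),
    "TOOL_C": rng.normal(0.84, 0.03, 40),   # candidate root cause
}

# Kruskal-Wallis test: does the equipment used explain yield differences?
h_stat, p = stats.kruskal(*yield_by_tool.values())
print(f"Kruskal-Wallis p = {p:.4g}")

# K-means clustering of lots (yield vs. a second process measurement)
# to separate normal lots from a low-yield group worth investigating.
yields = np.concatenate(list(yield_by_tool.values()))
second_var = rng.normal(1.0, 0.05, yields.size)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(
    np.column_stack([yields, second_var]))
print("Cluster sizes:", np.bincount(labels))
```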
Article
When building prediction models in the semiconductor environment, many variables, such as input/output variables, have causal relationships which may lead to multicollinearity. There are several approaches to address multicollinearity: variable elimination, orthogonal transformation, and adoption of biased estimates. This paper reviews these methods with respect to an application that has a structure more complex than simple pairwise correlations. We also present two algorithmic variable elimination approaches and compare their performance with that of the existing principal component regression and ridge regression approaches in terms of residual mean square and R2. Copyright © 2011 John Wiley & Sons, Ltd.
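Two of the remedies compared in the paper, principal component regression and ridge regression, can be prototyped with scikit-learn as below; the synthetic collinear data and the choices of component count and penalty are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.01, size=n)       # nearly collinear with x1
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = 2.0 * x1 + 0.5 * x3 + rng.normal(scale=0.1, size=n)

# Principal component regression: regress on a few principal components.
pcr = make_pipeline(PCA(n_components=2), LinearRegression()).fit(X, y)

# Ridge regression: keep all variables but shrink the coefficients.
ridge = Ridge(alpha=1.0).fit(X, y)

print("PCR R^2:  ", pcr.score(X, y))
print("Ridge R^2:", ridge.score(X, y))
print("Ridge coefficients:", ridge.coef_)
```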
Article
The problem of identifying differentially expressed genes in designed microarray experiments is considered. Lonnstedt and Speed (2002) derived an expression for the posterior odds of differential expression in a replicated two-color experiment using a simple hierarchical parametric model. The purpose of this paper is to develop the hierarchical model of Lonnstedt and Speed (2002) into a practical approach for general microarray experiments with arbitrary numbers of treatments and RNA samples. The model is reset in the context of general linear models with arbitrary coefficients and contrasts of interest. The approach applies equally well to both single-channel and two-color microarray experiments. Consistent, closed-form estimators are derived for the hyperparameters in the model. The estimators proposed have robust behavior even for small numbers of arrays and allow for incomplete data arising from spot filtering or spot quality weights. The posterior odds statistic is reformulated in terms of a moderated t-statistic in which posterior residual standard deviations are used in place of ordinary standard deviations. The empirical Bayes approach is equivalent to shrinkage of the estimated sample variances towards a pooled estimate, resulting in far more stable inference when the number of arrays is small. The use of moderated t-statistics has the advantage over the posterior odds that the number of hyperparameters that need to be estimated is reduced; in particular, knowledge of the non-null prior for the fold changes is not required. The moderated t-statistic is shown to follow a t-distribution with augmented degrees of freedom. The moderated t inferential approach extends to accommodate tests of composite null hypotheses through the use of moderated F-statistics. The performance of the methods is demonstrated in a simulation study. Results are presented for two publicly available data sets.
Article
Simultaneous inference is a common problem in many areas of application. If multiple null hypotheses are tested simultaneously, the probability of rejecting erroneously at least one of them increases beyond the pre-specified significance level. Simultaneous inference procedures have to be used which adjust for multiplicity and thus control the overall type I error rate. In this paper we describe simultaneous inference procedures in general parametric models, where the experimental questions are specified through a linear combination of elemental model parameters. The framework described here is quite general and extends the canonical theory of multiple comparison procedures in ANOVA models to linear regression problems, generalized linear models, linear mixed effects models, the Cox model, robust linear models, etc. Several examples using a variety of different statistical models illustrate the breadth of the results. For the analyses we use the R add-on package multcomp, which provides a convenient interface to the general approach adopted here.
Article
This paper studies the defect data analysis method for semiconductor yield enhancement. Given the defect locations on a wafer, the local defects generated from the assignable causes are classified from the global defects generated from the random causes by model-based clustering, and the clustering methods can identify the characteristics of local defect clusters. The information obtained from this method can facilitate process control, particularly, root-cause analysis. The global defects are modeled by the spatial non-homogeneous Poisson process, and the local defects are modeled by the bivariate normal distribution or by the principal curve.
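As a rough analogue of the model-based clustering step described here, the scikit-learn sketch below fits a Gaussian mixture to defect coordinates on a wafer and treats low-likelihood points as global (random) defects. The actual method models global defects with a spatial non-homogeneous Poisson process and local clusters with bivariate normals or principal curves, so this is only a simplified stand-in on synthetic data.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(2)

# Synthetic defect map: a dense local cluster plus uniformly scattered global defects.
local = rng.normal(loc=[20.0, -10.0], scale=2.0, size=(60, 2))
scattered = rng.uniform(low=-50.0, high=50.0, size=(80, 2))
defects = np.vstack([local, scattered])

# Model-based clustering with a two-component Gaussian mixture.
gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0).fit(defects)
labels = gmm.predict(defects)

# Treat points with low density under the fitted mixture as global (random) defects.
log_density = gmm.score_samples(defects)
is_global = log_density < np.quantile(log_density, 0.5)
print("Cluster sizes:", np.bincount(labels))
print("Defects flagged as global:", int(is_global.sum()))
```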