Article

The goodness of fit of regression formulae, and the distribution of regression coefficients

... Many existing procedures can be interpreted as special cases of IIS in that they represent particular algorithmic implementations of IIS. As Table 1 summarizes, such special cases include Fisher's (1922) covariance statistic, recursive estimation, the Chow (1960) predictive failure statistic (including the one-step, breakpoint, and forecast versions implemented in OxMetrics), the unknown breakpoint test (proposed by Nyblom (1989); Hansen (1992); and Andrews (1993)); and the Bai and Perron (1998) multiple breakpoint test. IIS also includes rolling regression, the tests of extended constancy in Ericsson et al. (1998, p. 305ff), tests of nonlinearity, intercept correction (in forecasting), and robust estimation. ...
... Covariance equality: Fisher (1922). Recursive estimates: Plackett (1950). Forecast errors: Chow (1960). Single unknown breakpoint in regression coefficients: Nyblom (1989), Hansen (1992), Andrews (1993). Multiple unknown breakpoints in the intercept: Bai and Perron (1998). Arbitrary unknown impulse breaks (impulse indicator saturation): Hendry (1999), Hendry et al. (2008), Johansen and Nielsen (2009). Arbitrary unknown step breaks (step indicator saturation): Castle et al. (2015). Step, trend, and coefficient breaks (super saturation, ultra saturation, and multi-saturation): ...
... This evidence on the baseline model does not, however, assess whether the estimated coefficients are constant over time. Fisher's (1922) covariance test statistic is one way of assessing constancy, but it does require specifying which subsamples to compare. Splitting the sample (arbitrarily) into one third and two thirds, Fisher's statistic is F(3, 71) = 2.41 [0.0743], ...
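As an illustration of the subsample comparison described in the snippet above, the following sketch computes a Chow/Fisher-type F statistic by splitting a sample into two parts and comparing the pooled and split residual sums of squares. It is a generic sketch only: the simulated data, the number of regressors, and the one-third/two-thirds split are assumptions, not values from the cited study.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, k = 90, 3                      # observations and regressors (incl. intercept)
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 0.5, -0.3]) + rng.normal(scale=0.5, size=n)

def rss(X, y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return resid @ resid

split = n // 3                    # arbitrary one-third / two-thirds split
rss_pooled = rss(X, y)
rss_split = rss(X[:split], y[:split]) + rss(X[split:], y[split:])

# Chow/Fisher-type F statistic: does allowing separate coefficients
# in the two subsamples reduce the residual sum of squares significantly?
F = ((rss_pooled - rss_split) / k) / (rss_split / (n - 2 * k))
p_value = stats.f.sf(F, k, n - 2 * k)
print(f"F({k}, {n - 2*k}) = {F:.2f}, p = {p_value:.4f}")
```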
Article
Full-text available
Structural breaks have attracted considerable attention recently, especially in light of the financial crisis, Great Recession, the COVID-19 pandemic, and war. While structural breaks pose significant econometric challenges, machine learning provides an incisive tool for detecting and quantifying breaks. The current paper presents a unified framework for analyzing breaks; and it implements that framework to test for and quantify changes in precipitation in Mauritania over 1919–1997. These tests detect a decline of one third in mean rainfall, starting around 1970. Because water is a scarce resource in Mauritania, this decline—with adverse consequences on food production—has potential economic and policy consequences.
... where the joint distribution of the response and explanatory variables was assumed to be Gaussian [13,14]. Then Fisher showed that the conditional distribution of the response variable needs to be Gaussian, but the joint distribution need not be [15]. ...
... c_t = f_t * c_{t-1} + i_t * g_t (4.15). On the right-hand side of Equation (4.15), if we used only the input gate with the sigmoid function acting as the activation function, we would never obtain the zero that triggers a forget action, because the sigmoid function produces positive outputs between 0 and 1. The tanh activation function, on the other hand, is more suitable because its outputs lie between −1 and 1. ...
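A minimal numeric sketch of the cell-state update in Equation (4.15), written here with NumPy to show why sigmoid gates stay in (0, 1) while the tanh candidate can take negative values; the pre-activation values and dimensions are illustrative assumptions, not taken from the cited thesis.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Illustrative pre-activations for one time step (assumed values).
z_f, z_i, z_g = np.array([-4.0, 2.0]), np.array([1.0, 0.5]), np.array([0.8, -1.2])

f_t = sigmoid(z_f)          # forget gate in (0, 1): values near 0 mean "forget"
i_t = sigmoid(z_i)          # input gate in (0, 1)
g_t = np.tanh(z_g)          # candidate values in (-1, 1), so they can offset c_{t-1}

c_prev = np.array([0.9, -0.4])
c_t = f_t * c_prev + i_t * g_t   # Equation (4.15)
print(c_t)
```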
... Figure 15: Accuracy curve for RNN on the EMNIST data set. ...
Thesis
Full-text available
In this dissertation, we study Artificial Neural Networks and their taxonomy. Also, we analyze the results of our numerical experiments of applying three different types of neural network architectures on four different data sets.
... Feature Scaling. During the preprocessing phase, this process is crucial because the majority of machine learning algorithms work much better when they deal with features on the same scale [12]. The most commonly used techniques include normalization, which involves rescaling the features to the interval [0, 1] and constitutes a special case of min-max scaling. We will only need to scale each feature column using the min-max method to normalize the data. ...
... We will only need to scale each feature column using the min-max method to normalize the data: x_norm = (x − X_min) / (X_max − X_min) (1). "Standardization" simply involves standardizing each feature column to mean zero with a standard deviation of 1, so that the columns have the same parameters as a standard normal distribution [13]. By doing this, it is much easier for the algorithms to determine what parameters to learn. ...
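A short sketch of the two scalings described above (min-max normalization to [0, 1] and standardization to zero mean and unit standard deviation); the array is an assumed toy example.

```python
import numpy as np

X = np.array([[1.0, 200.0], [2.0, 300.0], [3.0, 500.0]])  # toy feature columns

# Min-max normalization, Equation (1): rescale each column to [0, 1].
X_minmax = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

# Standardization: zero mean and unit standard deviation per column.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

print(X_minmax)
print(X_std)
```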
Article
Full-text available
People who want to buy a new home tend to save more on their budgets and market strategies. The current system includes real estate calculations without the necessary forecasts for future market trends and inflation. The housing market is one of the most competitive in terms of pricing, and prices vary greatly with many factors. Asset pricing is an important factor in decision-making for both buyers and investors, supporting budget allocation, acquisition strategies and deciding on the best plans; as a result, it is one of the most important areas in which machine learning ideas can be used to maximize and accurately anticipate prices. Therefore, in this paper, we present the different significant factors that we employ to accurately anticipate property values. To reduce residual errors, we can utilize regression models with a range of characteristics. Some feature engineering is required when employing features in the regression model for improved prediction. To improve model fit, a set of multi-regression elements or a polynomial regression (with a set of varying strengths in the elements) is frequently utilized. In these models, the fit is expected to be significantly affected by the slope of the spine used to reduce it. Therefore, it directs the best use of regression models over other strategies to maximize the effect. This paper's goal is to predict freehold prices for freehold consumers based on their budgets and goals. Prospective prices can be forecast by evaluating past market trends and price levels, as well as future developments.
... Statistical tests which partition the initial dataset and compare certain statistics of the partitions to detect model misspecification are usually applied in the literature. The original lack-of-fit test was developed by Fisher [43]. The test requires repeated measurements, that is, more than one observation at least at one given x value. ...
... and s_i^2 is the residual variance of the i-th batch. From Eq. (43) and Eq. (44): ...
Thesis
Full-text available
In the first part of the thesis, after the literature review, I present statistical methods used in the field of pharmaceutical quality control, more specifically in stability studies. This part considers the application of univariate and multivariate control charts and some newly developed statistical tests, as well. My work in this field contributes to the scientific advancement by correcting the suggested methods that can be found in the literature and suggesting new advanced methods to increase the effectiveness of quality control in pharmaceutical stability studies. In the second part of my thesis, after the literature review, I present the application of designed experiments and other statistical techniques to evaluate data obtained in high-pressure experiments. My work in this field contributes to the scientific advancement by presenting the model construction method and showing that designed experiment can be and should be applied in such experiments to increase the efficiency of the experimental studies. This part also contains the correction of an already existing, widely applied physical-chemical formula that is connected to solubility.
... The work supplies a qualitative assessment of the total impact of the studied factors on the resultant index by means of MS Office Excel. A complex interaction of all factors with the resultant index can be described by a polynomial regression of order n [5,16,22]: y = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + … + a_n x^n + ε. (3) A theoretical base for the study of cargo conveyance rules and the movement of transport vehicles according to the set principles of control, for keeping to the laws in the field of transport and conveyance of cargo in the countries of the EU, is made up of the following sources, in particular (Regulation ( ). Concerning Ukraine, the field is regulated by such regulatory acts as the Law of Ukraine About transport [40], the Law of Ukraine About conveyance of dangerous cargo [41], the Law of Ukraine About technical guidelines and compliance assessment [39], the Resolution of the Cabinet of Ministers of Ukraine About transit of large and heavy motor vehicles by motor ways, streets and railway crossings [37], etc. ...
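A compact sketch of fitting a polynomial regression of the form in Equation (3) by least squares; the degree and the synthetic data are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 + 1.5 * x - 0.3 * x**2 + rng.normal(scale=1.0, size=x.size)  # noisy quadratic

degree = 2                                # assumed polynomial order n
coeffs = np.polyfit(x, y, deg=degree)     # returns a_n ... a_0 (highest power first)
y_hat = np.polyval(coeffs, x)

residual_std = np.std(y - y_hat, ddof=degree + 1)
print("coefficients (a2, a1, a0):", coeffs)
print("residual standard deviation:", residual_std)
```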
... Weight of transported cargo, ton: 16 ...
Article
Full-text available
At the current stage of development of the motor transport branch in Ukraine, satisfactory equipment of enterprises, particularly motor transport enterprises, with fixed assets is one of the main problems. The fleet of road transport vehicles of those enterprises needs almost complete renovation, and the production and technical base for maintenance inspection and repair of transport requires complete technical re-equipment. The aim of the work is to develop theoretical approaches and practical recommendations concerning maintenance of road transport vehicles of motor transport enterprises, choice of the type of transportation means and determination of the necessary number of vehicles depending on the production plan of their operation.
... When building a regression model, the dependent variable to be modelled is required to follow a normal distribution (Fisher, 1922). In the discomfort-level prediction models to be constructed for four different ride speeds, the Kolmogorov-Smirnov and Shapiro-Wilk tests were applied to investigate whether the dependent variable, the awz values, conforms to a normal distribution. ...
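A brief sketch of the normality screening described in the snippet, using SciPy's Kolmogorov-Smirnov and Shapiro-Wilk tests on a stand-in sample; the variable name and data are assumptions, not the study's awz measurements.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
awz = rng.normal(loc=0.5, scale=0.1, size=60)   # stand-in for measured awz values

# Kolmogorov-Smirnov against a normal with the sample's own mean and std.
ks_stat, ks_p = stats.kstest(awz, "norm", args=(awz.mean(), awz.std(ddof=1)))
# Shapiro-Wilk test of normality.
sw_stat, sw_p = stats.shapiro(awz)

print(f"KS: stat={ks_stat:.3f}, p={ks_p:.3f}")
print(f"Shapiro-Wilk: stat={sw_stat:.3f}, p={sw_p:.3f}")
```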
Article
Full-text available
In this study, it is aimed to mathematically model the relationships between the Pavement Condition Index (PCI), which is used as a pavement performance indicator, and the amount of whole-body vibration exposure in a passenger car. Vibration measurements were analyzed according to the frequency-weighted data processing method, the technical details of which are explained in the ISO 2631 standard, and aw values were obtained in the vertical direction. Mathematical relationships between PCI values and actual vibration measurement data in the range of 20-50 km/h ride speed were modeled using linear regression analysis on the road sections determined in an urban road network with a bituminous hot-mixed pavement. The statistical compatibility of the models was examined. Through the mathematical models generated for each ride speed, the threshold values of PCI that affect ride comfort, and the ride comfort values corresponding to the PCI limit values in the traditional evaluation scale recommended by the PAVER system, were determined. In the evaluated speed range, the PCI limit values were 0, 11, 29, 41 on the ‘a little uncomfortable - fairly uncomfortable’ threshold and 37, 62, 69, 77 on the ‘not uncomfortable - a little uncomfortable’ threshold, respectively. Finally, the threshold values produced by the linear regression method were compared with the threshold values obtained by logistic regression, fuzzy logic and artificial neural network techniques in previous studies in the literature. It was determined that linear regression analysis generated lower PCI threshold values than the other techniques.
... A smaller p-value indicates a stronger inconsistency with the null hypothesis. If the p-value is less than a predetermined threshold (the significance level), the null hypothesis is rejected [51,52]. ...
Preprint
Full-text available
Contrast pattern mining (CPM) is an important and popular subfield of data mining. Traditional sequential patterns cannot describe the contrast information between different classes of data, while contrast patterns, which involve the concept of contrast, can describe the significant differences between datasets under different contrast conditions. Based on the number of papers published in this field, we find that researchers' interest in CPM is still active. Since CPM has many research questions and research methods, it is difficult for new researchers in the field to understand the general situation of the field in a short period of time. Therefore, the purpose of this article is to provide an up-to-date, comprehensive and structured overview of the research directions of contrast pattern mining. First, we present an in-depth understanding of CPM, including basic concepts, types, mining strategies, and metrics for assessing discriminative ability. Then we classify CPM methods according to their characteristics into boundary-based algorithms, tree-based algorithms, evolutionary fuzzy system-based algorithms, decision tree-based algorithms, and other algorithms. In addition, we list the classical algorithms of these methods and discuss their advantages and disadvantages. Advanced topics in CPM are presented. Finally, we conclude our survey with a discussion of the challenges and opportunities in this field.
... In contrast, a value close to one shows that the independent variables can explain the dependent variable. The R² value is desired to be close to one (Fisher, 1922). ...
Article
Full-text available
Solar radiation, which is used in hydrological modeling, agriculture, solar energy systems, and climatological studies, is the most important element of the energy reaching the earth. The present study compared the performance of two empirical equations (the Angstrom and Hargreaves-Samani equations) and three machine learning models (Artificial Neural Networks (ANN), Support Vector Machine (SVM), and Long Short-Term Memory (LSTM)). Various learning models were developed for the variables used in each empirical equation. In the present study, monthly data of six stations in Turkey, three stations receiving the most solar radiation and three stations receiving the least solar radiation, were used. In terms of the mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), and determination coefficient (R²) values of each model, LSTM was the most successful model, followed by ANN and SVM. The MAE value was 2.65 with the Hargreaves-Samani equation and decreased to 0.987 with the LSTM model, while MAE was 1.24 with the Angstrom equation and decreased to 0.747 with the LSTM model. The study revealed that the deep learning model is more appropriate to use compared to the empirical equations even in cases where there is limited data.
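For reference, the four goodness-of-fit metrics named in the abstract can be computed as in the following sketch; the observed and predicted vectors are assumed toy values, not the study's data.

```python
import numpy as np

y_obs = np.array([3.2, 4.1, 5.0, 6.3, 7.1])    # assumed observations
y_pred = np.array([3.0, 4.4, 4.8, 6.0, 7.4])    # assumed model predictions

mse = np.mean((y_obs - y_pred) ** 2)
rmse = np.sqrt(mse)
mae = np.mean(np.abs(y_obs - y_pred))
r2 = 1.0 - np.sum((y_obs - y_pred) ** 2) / np.sum((y_obs - y_obs.mean()) ** 2)

print(f"MSE={mse:.3f}, RMSE={rmse:.3f}, MAE={mae:.3f}, R2={r2:.3f}")
```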
... The Fisher test (see [6]) with a significance level η = 0.05: ...
Article
Full-text available
Confocal microscope images are widely useful in medical diagnosis and research. The automatic interpretation of this type of image is very important, but it is a challenging endeavor in the image processing area, since these images are heavily contaminated with noise and have low contrast and low resolution. This work deals with the problem of analyzing the penetration velocity of a chemotherapy drug in an ocular tumor called retinoblastoma. The primary retinoblastoma cell cultures are exposed to the topotecan drug and the penetration evolution is documented by producing sequences of microscopy images. It is possible to quantify the penetration rate of the topotecan drug because it produces fluorescence emission under laser excitation, which is captured by the camera. In order to estimate the topotecan penetration time in the whole retinoblastoma cell culture, a procedure based on an active contour detection algorithm, a neural network classifier, and a statistical model and its validation is proposed. This new inference model allows estimation of the penetration time. Results show that the mean penetration time strongly depends on tumorsphere size and on the chemotherapeutic treatment that the patient has previously received.
... By now this problem should be familiar: least squares gives us estimates of the intercept and the slope, but we need to measure the uncertainty about them in order to judge how much we can trust that this relationship did not arise by chance in our sample. If there is no relationship at all between the variables, Y should be constant along X. Hence, the null hypothesis we want to test is b = 0. Ronald Fisher (1922a) showed that the null distribution of regression coefficients follows the distribution of the t statistic, created by William Gosset in 1908. For this reason, we can test the slope using this statistic, and most statistical analysis programs do exactly that. ...
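A small sketch of that slope test: fit a simple linear regression by least squares and test b = 0 with the t statistic of the slope; the simulated x and y are assumptions for illustration only.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
x = rng.uniform(0, 10, size=30)
y = 1.0 + 0.4 * x + rng.normal(scale=1.0, size=x.size)

# scipy's linregress returns the slope, its standard error and the two-sided
# p-value of the t test for the null hypothesis b = 0.
res = stats.linregress(x, y)
t_stat = res.slope / res.stderr
print(f"slope={res.slope:.3f}, t={t_stat:.2f}, p={res.pvalue:.4f}")
```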
Method
Full-text available
This handbook began as support material for the course "Linear Models", offered through the Graduate Program in Ecology of the Instituto Nacional de Pesquisas da Amazônia (INPA) starting in 2018. The goal of the course is to offer a clear and unified view of the main concepts involved in the use of linear models (t test, ANOVA, ANCOVA, regression, GLM(M) and their variations) to answer scientific questions. The goal of the handbook is to demonstrate that the apparently endless methods of statistical analysis are, essentially, just one: a linear model. That is, a model that uses a (usually straight) line to summarize a cloud of points reflecting the relationship between two (or more) variables. Our experience is that this understanding helps to organize the confusion in which most students without a background in the so-called "exact sciences" get lost when trying to learn statistics, making its conscious application easier. Our view is that, just as no one needs to be an engineer to drive a vehicle or use a computer, no one needs to be a mathematician to be able to use statistics to answer questions of interest. We use little mathematics and assume familiarity only with the four basic operations. In contrast, we use many figures (70 figures) to facilitate communication. The handbook is a work in progress. Stay tuned for new versions!
... Fisher's exact test [35] tool at the Cenargen Bioinformatics platform was used to compare and find out the levels and significance of contig expression between the generated libraries which had passed quality control. ...
Preprint
Full-text available
Heavy metal-contaminated soils are a widespread problem in mine field environments. To gain insights into the structural and functional diversity of the bacterial community present in chromium-contaminated mine soil obtained from the Sukinda chromite mine of Odisha (India), whole metagenomic analysis was performed using the Illumina HiSeq platform. Chromite mine soils contain high concentrations of heavy metals such as Mn, Fe and Cu besides chromium, and low concentrations of organic carbon, available nitrogen, phosphorus and potassium. The bacterial community living in that hostile environment was correlated with the inherent abilities of these strains to be involved in the reduction of the major heavy metal Cr(VI). Our results showed that Proteobacteria (66.45%) were the most abundant in the study area, followed by Actinobacteria (17.32%) and Bacteroidetes (4.65%). The Kyoto Encyclopedia of Genes and Genomes (KEGG) functional category has 228543 predicted functions, 20% of which were involved in cellular metabolic functions, 5.21% in genetic information processing and 5.13% in environmental information processing, while the SEED functional category has 112542 predicted functions, of which 11.73% are involved in carbohydrate metabolism (13202 hits), followed by 10.57% in amino acids and derivatives (11899 hits), 8.24% in protein metabolism (9282 hits), 6.88% in DNA metabolism (7746 hits), and 5.94% in cofactors, vitamins, prosthetic groups and pigments (6687 hits). The isolated bacterial consortia exhibited visible growth up to a Cr concentration of 800 mg L⁻¹. The results presented in this study have important implications for understanding bacterial diversity in chromium-contaminated mine soils and their role in exhibiting Cr(VI) resistance.
... Statistical tests which partition the initial dataset and compare certain statistics of the partitions to detect model misspecification are usually applied in the literature. The original lack-of-fit test was developed by Fisher [17]. The test requires repeated measurements at a given x_i. ...
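As a concrete illustration of Fisher's lack-of-fit idea (replicate observations at the same x allow the residual variation to be split into pure error and lack of fit), the sketch below computes the classical lack-of-fit F test for a straight-line fit; the replicated design and data values are assumptions.

```python
import numpy as np
from scipy import stats

# Replicated design: several observations at each x level (assumed data).
x = np.array([1, 1, 2, 2, 3, 3, 4, 4], dtype=float)
y = np.array([1.1, 1.3, 2.9, 3.2, 4.2, 3.8, 7.9, 8.3])

# Fit a straight line by least squares.
b1, b0 = np.polyfit(x, y, deg=1)
fitted = b0 + b1 * x
sse = np.sum((y - fitted) ** 2)                      # total residual sum of squares

# Pure-error sum of squares: variation of replicates around their level means.
levels = np.unique(x)
sspe = sum(np.sum((y[x == lv] - y[x == lv].mean()) ** 2) for lv in levels)
sslof = sse - sspe                                   # lack-of-fit sum of squares

df_lof = len(levels) - 2                             # levels minus fitted parameters
df_pe = len(y) - len(levels)
F = (sslof / df_lof) / (sspe / df_pe)
p = stats.f.sf(F, df_lof, df_pe)
print(f"F({df_lof}, {df_pe}) = {F:.2f}, p = {p:.4f}")
```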
Article
Full-text available
Pharmaceutical stability studies are conducted to estimate the shelf life, i.e. the period during which the drug product maintains its identity and stability. In the evaluation process, a regression curve is fitted to the data obtained during the study and the shelf life is determined using the fitted curve. The evaluation process suggested by ICH considers only the case of the true relationship between the measured attribute and time being linear. However, no method is suggested for the practitioner to decide whether the linear model is appropriate for their dataset. This is a major problem, as a falsely selected model may distort the estimated shelf life to a great extent, resulting in unreliable quality control. The difficulty of detecting model misspecification in stability studies is that very few observations are available. The conventional methods applied for model verification might not be appropriate or efficient due to the small sample size. In this paper, this problem is addressed and some developed methods are proposed to detect model misspecification. The methods can be applied to any process where the regression estimation is performed on independent small samples. Besides stability studies, the frequently performed construction of single calibration curves for an analytical measurement is another case where the methods may be applied. It is shown that our methods are statistically appropriate and some of them have high efficiency in the detection of model misspecification when applied in simulated situations which resemble pre-approval and post-approval stability studies.
... Statistical analysis method: Regression analysis determines the form of a dependence and expresses it as an equation (Fisher, 1922). The main purpose of regression analysis is to identify some form of dependence among statistically related phenomena, to derive an appropriate equation based on how well this dependence agrees with the given experimental and measured values, and to determine the temporal and spatial relationships between the studied objects and their future trends (Freedman, 2009). ...
Article
Full-text available
The Lake Ugii is a freshwater lake located in the valley of the Orkhon River in Arkhangai province. The Lake Ugii is one of the main attractions for tourist activities and lies near the Kharkhorin historical area. There is an urgent need to reduce the negative impact of human activities on nature and ecology around the world. In addition to the main areas of tourism, land cover and land use types have changed quickly under anthropogenic impact in the study area. There is no study focusing on land cover changes with anthropogenic impact and lake water surface area. The main purpose of the study is to identify changes in the surface of the Lake Ugii area in connection with anthropogenic impacts. The remote sensing based method of the Normalized Difference Vegetation Index (NDVI) was used together with comparative analysis and statistical analysis methods. Study findings suggest that the surface area around the Lake Ugii changed significantly in recent years due to the impact of livestock and tourism. The results show that the Lake Ugii area has changed over the last 30 years due to the construction of tourism facilities, the digging of new wells, the impact of fishing, the road network, and human activities. The uncovered road length near the Lake Ugii has reached up to 136 km in the last 30 years. Between 2004 and 2019, seven buildings in the study area expanded by 10.3 hectares, and more than 70 hectares of land were affected by anthropogenic changes. This study is significant in identifying the consequences of improper land use in a tourist destination.
... The goodness-of-fit test uses the predictive-relevance (Q2) value in statistical hypothesis testing to analyze how well the sample data fit a distribution from a normally distributed population [23]. In addition, the difference between the observed value and the expected value of the normal distribution can be assessed by applying the goodness-of-fit test [24]. The analysis showed that the predictive-relevance value reached 0.787 (78.7%), which indicates that the research model or design has a relevant predictive value. ...
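One common way to obtain a predictive-relevance style Q2 value is from out-of-sample (cross-validated) predictions, as sketched below with scikit-learn. This is a generic illustration, not the blindfolding procedure of the PLS software used in the chapter, and the data are simulated.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(4)
X = rng.normal(size=(100, 3))
y = X @ np.array([0.8, -0.5, 0.3]) + rng.normal(scale=0.5, size=100)

# Q2 = 1 - (squared prediction error) / (squared error around the mean),
# computed here from 5-fold cross-validated predictions.
y_cv = cross_val_predict(LinearRegression(), X, y, cv=5)
q2 = 1.0 - np.sum((y - y_cv) ** 2) / np.sum((y - y.mean()) ** 2)
print(f"Q2 = {q2:.3f}")
```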
Chapter
Full-text available
Organizational development requires human resources. Professional organizations manage systems and the organizational mechanisms of existing resources to support flexible responses to change. Hospitals are part of the service industry, have extraordinarily complex business processes, and offer a large potential for optimization and efficiency improvements. The aim of this study is to explore the relationship between emotional intelligence and team performance during the inter-institutionalized collaboration work process. This study was conducted in the South Sulawesi and Central Sulawesi Province's hospitals. The study lasted for six months in 2017. The sampling was done by using the cluster method and stratified random sampling, based on hospital type and the level of health officers. The data analysis approach used in this study was partial least squares (PLS), using WarpPLS software. The results show that emotional intelligence significantly and positively affected team performance with a path coefficient value of 0.138 and a p-value of 0.050. Based on the results of the data analysis, it can be concluded that there is a significant direct influence of emotional intelligence on team performance.
... In addition to the analysis presented, if it were possible to acquire all of the packet arrival times for SF routing, the same values of the slope and intercept coefficients could be obtained by computing the mean line through the points using the linear regression method [5] [6]. In that case, since the parameters of the resulting line would be exact, because all the points would lie on it, the coefficients would be obtained exactly. ...
... In this period, the dependent and independent variables were assumed to have normal distributions. This assumption was extended with Fisher's publications in 1922 and 1925 to apply only to cases in which the conditional distribution of the dependent variable is normal (Fisher, 1922). ...
Conference Paper
Full-text available
Classification and Regression Trees (CART) is a predictive algorithm used to explain how a dependent variable can be predicted using independent variables (numerical and character). The aim of the study was to identify the importance of biometric traits in the estimation of body weight of Boer goats. A total of seventy-one Boer goats between one and two years of age were used. CART was used for data analysis. CART findings showed that sex played a crucial role in the body weight of Boer goats. The CART model established in this study could be used by breeders to advise Boer goat farmers who cannot afford to purchase a weighing scale on which biometric traits they can use to select their animals in order to improve their herd. However, further studies need to be done to validate the use of CART in the prediction of body weight from biometric traits of Boer goats using a larger sample size, different areas or other goat breeds. Key words: Body length, Ear length, Heart girth, Head width, Rump width
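A minimal sketch of a CART-style regression tree for predicting body weight from biometric traits, using scikit-learn; the trait names follow the keywords above, but the data values, tree depth and simulated relationship are illustrative assumptions, not the study's measurements.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(5)
n = 71
# Columns: body length, ear length, heart girth, head width, rump width (assumed values, cm).
X = rng.normal(loc=[70, 15, 85, 14, 18], scale=[5, 1, 6, 1, 2], size=(n, 5))
weight = 0.4 * X[:, 0] + 0.6 * X[:, 2] + rng.normal(scale=2.0, size=n)  # kg, toy model

tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, weight)
print("feature importances:", np.round(tree.feature_importances_, 3))
print("predicted weight of first goat:", round(tree.predict(X[:1])[0], 1))
```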
... In 1815 Gergonne wrote a paper on "The application of the method of least squares to the interpolation of sequences" [5], later translated into English by St. John and Stigler [6]. In the last 120 years, polynomial regressions have contributed much to the development of regression analysis [7][8][9], which has many diverse applications, including an interesting one in polymerase chain reaction bias correction in quantitative DNA methylation studies [10]. ...
Article
Full-text available
An accurate estimation of the noise power from noisy data leads to better estimation of signal-to-noise ratio (SNR) and is useful in detection, estimation, and prediction. The major contributions of this paper are to estimate the polynomial degree and the noise power from data coming from an underlying polynomial with additive Gaussian noise, using an AR model. The two proposed methods have been inspired by the recent results that all finite degree polynomials have equivalent representation in finite order autoregressive (AR) models, with known AR coefficients and different constant terms. Preliminary experiments in a variety of scenarios provide estimations of the constant term and the standard deviation of these estimations, which are then used as a guide to developing theoretically the probability density functions. In the first stage, the degree of a polynomial is selected by minimizing the variance of the estimations of the constant term in the equivalent AR model. In the second stage, the noise variance is estimated using the estimated degree of a polynomial, a combination of the variance of the estimations of the constant term, and another known parameter. Further computer experiments have been carried out for evaluating the proposed methods for degree and noise power estimations. Four well-known and well-regarded maximum likelihood-based approaches have been used for comparisons.
... More recently, researchers have investigated the effects of GHRM on environmental performance by using MLR [16], [17], [18]. Inadequate incentives from top management as the main obstacle to environmental orientation have been acknowledged in some studies [19], [20]. MLR analysis is relevant to the forecasting of new findings and to the assessment of future sustainable GHRM regulations. ...
Conference Paper
Green Human Resource Management (Green HRM) is an emerging subject in the global context today. It is a step towards sustainable organizational trading performance and profit optimization. This is a recent phenomenon in Sri Lanka, where Green HRM practically struggles to deal with adverse environmental consequences within many firms. Many environmental issues are frequently formed in line with broad corporate imperfections. Therefore, reducing ecological imbalance along with a green economy should be strongly considered by the corporate sector. The sector of a firm might affect the potential for Green HRM performance, and the difference between public and private sector firms needs to be understood. This study attempts to analyze the performance of Green HRM between public and private sector firms as a Green HRM initiative. Two research objectives were set for this intent: identifying significant Green HRM practices between public and private sector firms on solid waste disposal, and creating a Relative Important Index (RII) for defining possible gaps between the two sectors. To achieve the 1st objective, 8 socioeconomic indicators and 8 Green HRM practices were arranged on solid waste (SW) disposal. Multiple Linear Regression (MLR) results reveal that the age, number of employees, SW-reduction and use of separate bins indicators are highly significant at 0.000. The 2nd objective was achieved by using the Relative Important Index (RII). Keywords: Green Human Resource Management (Green HRM), Green HRM practices, public and private sector firms, Relative Important Index (RII)
... Regression analysis is used for curve fitting purposes in most current studies. Thus, numerical modeling of events that are difficult to model analytically is provided in the light of experimental data [9-14]. In this study, numerical modeling has been done by using the Hybrid method whose algorithm is given in Equations (1), (2) and (3). ...
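As a point of comparison for the kind of nonlinear curve fitting described above, the sketch below fits an assumed exponential-type model with SciPy's generic least-squares routine. It is not the paper's hybrid Newton-Raphson algorithm, and the model form and simulated pressure-force data are assumptions.

```python
import numpy as np
from scipy.optimize import curve_fit

def model(p, a, b, c):
    # Assumed nonlinear form relating tire pressure p to pitching force.
    return a * np.exp(b * p) + c

pressure = np.linspace(26, 40, 9)                      # psi, nine test pressures
rng = np.random.default_rng(6)
force = model(pressure, 2.0, 0.05, 30.0) + rng.normal(scale=0.3, size=pressure.size)

params, _ = curve_fit(model, pressure, force, p0=[1.0, 0.01, 1.0])
fitted = model(pressure, *params)
ss_res = np.sum((force - fitted) ** 2)
ss_tot = np.sum((force - force.mean()) ** 2)
print("parameters:", np.round(params, 3), "R2:", round(1 - ss_res / ss_tot, 4))
```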
Article
Full-text available
In this study, the effect of tire inflation pressure on the vehicle during braking was investigated experimentally. The experimental study was carried out in the Brake-Suspension Test Device designed as a half-vehicle model, which enables vehicle brake tests to be performed in the laboratory. During the tests, a total of nine different tire inflation pressures from 26 psi to 40 psi standard value were taken into consideration. As a result of the experiments carried out separately for each tire inflation pressure, a curve characterizing the effect of different pressure values on the pitching force was obtained. Based on the experimental results, different mathematical models suitable for the curve obtained are derived by nonlinear curve fitting. For the nonlinear curve fitting process, a hybrid iterative regression algorithm obtained by combining classical curve fitting algorithm with Newton-Raphson iteration method was used. It was observed that the correlation coefficients (R2) of the mathematical models obtained provided sufficient sensitivity to express the problem under consideration.
... The Battese and Coelli (1992) and Nishimuzu and Page (1982) functions were used to estimate and decompose GTFP before establishing the series of GTFP components for the period of review. The Maximum Likelihood [ML] estimation technique (Fisher 1912, 1921, 1922a, 1922b) was also used to generate the estimated variables of the stochastic production function applied in the analysis. Tables, percentages, graphs and the E-views econometric package were utilised to analyse and interpret the data generated for the study. ...
... Several model specifications were tested and compared with the final reported model chosen based on statistical goodness of fit indices, such as the log-likelihood function (LL), the Akaike Information Criteria (AIC) and the Bayesian Information Criteria (BIC), as well as behavioural correctness. The LL function is a logarithmic transformation of the likelihood function that measures the goodness of fit of a statistical model to a sample of data for given values of the unknown parameters (Fisher, 1922). The AIC and BIC estimate the amount of information lost by a model, with lower AIC and BIC values being indicative of relatively higher quality models. ...
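To make the model comparison criteria concrete, the sketch below fits two candidate regression specifications with statsmodels and reports their log-likelihood, AIC and BIC; the simulated data and the two specifications are assumptions for illustration, not the models of the cited article.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
x = rng.normal(size=200)
y = 1.0 + 0.8 * x + 0.3 * x**2 + rng.normal(scale=0.5, size=200)

X_lin = sm.add_constant(np.column_stack([x]))          # linear specification
X_quad = sm.add_constant(np.column_stack([x, x**2]))    # quadratic specification

for name, X in [("linear", X_lin), ("quadratic", X_quad)]:
    fit = sm.OLS(y, X).fit()
    print(f"{name}: LL={fit.llf:.1f}, AIC={fit.aic:.1f}, BIC={fit.bic:.1f}")
```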
Article
The development of policies promoting smart meter adoption is essential to guide the transition towards sustainable use of resources such as water, electricity and gas, as well as inform smart-city initiatives. This article explores household preferences in terms of different smart meters and identifies the amounts that households are willing to pay for different smart meter configurations to monitor electricity, water and gas based on the features of their home including dwelling type, size and property value. To this aim, we employ a mixed multinomial logit model that accounts for the heterogeneity in customers’ preferences for different smart meters. As a proof of concept, the proposed model is applied to a survey incorporating a discrete choice experiment carried out with 232 respondents in the Florianopolis metropolitan region, located in the south of Brazil. Our approach offers a number of advantages to facilitate the broader implementation of smart grid systems that would otherwise be overlooked using traditional approaches that rely on aggregated estimates for demand and willingness to pay for proposed schemes.
... This is an English translation by St. John and Stigler [4] of the original paper, which was written in French. In the last 120 or so years, polynomial regressions have contributed greatly to the development of regression analysis [5][6][7]. A few more recent and interesting diverse applications can be found in computer graphics [8], machine learning [9], and statistics [10], including robust regressions [11] without the use of the least-squares method. ...
Article
Full-text available
Given a set of noisy data values from a polynomial, determining the degree and coefficients of the polynomial is a problem of polynomial regression. Polynomial regressions are very common in engineering, science, and other disciplines, and they are at the heart of data science. Linear regressions and the least squares method have been around for two hundred years. Existing techniques select a model, which includes both the degree and coefficients of a polynomial, from a set of candidate models which have already been fitted to the data. The philosophy behind the proposed method is fundamentally different from what has been practised in the last two hundred years. In the first stage, only the degree of a polynomial to represent the noisy data is selected, without any knowledge or reference to its coefficient values. Having selected the degree, the polynomial coefficients are estimated in the second stage. The development of the first stage has been inspired by the very recent results that all polynomials of degree q give rise to the same set of known time-series coefficients of autoregressive models and a constant term μ. Computer experiments have been carried out with simulated noisy data from polynomials using four well-known model selection criteria as well as the proposed method (PTS1). The results obtained from the proposed method for degree selection and predictions are significantly better than those from the existing methods. Also, it is experimentally observed that the root-mean-square (RMS) prediction errors and the variation of the RMS prediction errors from the proposed method appear to scale linearly with the standard deviation of the noise for each degree of a polynomial.
... The interaction model moreover allows interactions between the selected promoter regions to explain the REN activation. Regarding modelling, we used the multiple linear regression model to fit the linear parameters, which is a standard statistical method [45]. In order to check which of the generated models can explain the measured data best, the respective absolute prediction error was calculated. ...
Article
Full-text available
Background: Understanding complex mechanisms of human transcriptional regulation remains a major challenge. Classical reporter studies already enabled the discovery of cis-regulatory elements within the non-coding DNA; however, the influence of genomic context and potential interactions are still largely unknown. Using a modified Cas9 activation complex, we explore the complexity of renin transcription in its native genomic context. Methods: With the help of genomic editing, we stably tagged the native renin on chromosome 1 with the firefly luciferase and stably integrated a programmable modified Cas9-based trans-activation complex (SAM complex) by lentiviral transduction into human cells. By delivering five specific guide-RNAs homologous to specific promoter regions of renin, we were able to guide this SAM complex to these regions of interest. We measured gene expression and generated and compared computational models. Results: SAM complexes induced activation of renin in our cells after renin-specific guide-RNA had been provided. All possible combinations of the five guides were subjected to model analysis in linear models. Quantifying the prediction error and calculating an estimator of the relative quality of the statistical models for our given set of data revealed that a model incorporating interactions in the proximal promoter is the superior model for explaining the data. Conclusion: By applying our combined experimental and modelling approach, we can show that interactions occur within the selected sequences of the proximal renin promoter region. This combined approach might potentially be useful to investigate other genomic regions. Our findings may help to better understand the transcriptional regulation of human renin.
... This is an English translation by St. John and Stigler [4] of the original paper that was written in French. In the last 130 or so years, polynomial regression contributed greatly to the development of regression analysis [5][6][7]. ...
Article
Full-text available
Two data modelling techniques, polynomial representation and time-series representation, are explored in this paper to establish their connections and differences. All theoretical studies are based on uniformly sampled data in the absence of noise. This paper proves that all data from an underlying polynomial model of finite degree q can be represented perfectly by an autoregressive time-series model of order q and a possible constant term μ, as in equation (2). Furthermore, all polynomials of degree q are shown to give rise to the same set of time-series coefficients of specific forms, with the only possible difference being in the constant term μ. It is also demonstrated that time-series with either non-integer coefficients or integer coefficients not of the aforementioned specific forms represent polynomials of infinite degree. Six numerical explorations, with both generated data and real data, including the UK data and US data on the current Covid-19 incidence, are presented to support the theoretical findings. It is shown that all polynomials of degree q can be represented by an all-pole filter with q repeated roots (or poles) at z = +1. Theoretically, all noise-free data representable by a finite order all-pole filter, whether they come from finite degree or infinite degree polynomials, can be described exactly by a finite order AR time-series; if the values of the polynomial coefficients are not of special interest in any data modelling, one may use time-series representations for data modelling.
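The equivalence described in that abstract can be checked numerically: uniformly sampled values of a degree-q polynomial satisfy an AR(q) recurrence whose coefficients are signed binomial coefficients (from the q-th finite difference), plus a constant term μ. The sketch below verifies this for an assumed cubic; the specific polynomial is an illustration, not one used in the paper.

```python
import numpy as np
from math import comb, factorial

q = 3                                            # assumed polynomial degree
coeffs = [2.0, -1.0, 0.5, 0.25]                  # a0, a1, a2, a3
n = np.arange(20)
p = sum(a * n**k for k, a in enumerate(coeffs))  # uniformly sampled polynomial values

# AR(q) coefficients from the q-th finite difference: signed binomial coefficients.
ar = [(-1) ** (k + 1) * comb(q, k) for k in range(1, q + 1)]   # [3, -3, 1] for q = 3
mu = factorial(q) * coeffs[q]                    # constant term for a unit sampling step

predicted = sum(c * p[q - k:len(p) - k] for k, c in enumerate(ar, start=1)) + mu
print(np.allclose(predicted, p[q:]))             # True: AR(3) plus mu reproduces the cubic
```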
Chapter
In this chapter, a review of the machine learning (ML) and pattern recognition concepts is given, and basic ML techniques (supervised, unsupervised, and reinforcement learning) are described. Also, a brief history of ML development from the primary works before the 1950s (including Bayesian theory) up to the most recent approaches (including deep learning) is presented. Then, an introduction to the support vector machine (SVM) with a geometric interpretation is given, and its basic concepts and formulations are described. A history of SVM progress (from Vapnik's primary works in the 1960s up to now) is also reviewed. Finally, various ML applications of SVM in several fields such as medical, text classification, and image classification are presented. Keywords: Machine learning, Pattern recognition, Support vector machine, History
Chapter
In the previous chapters we assumed that the data are generated from an exchangeable probability measure. In this chapter we generalize the method of conformal prediction to cover arbitrary statistical models that belong to the class of, as we call them, online compression models. Interesting online compression models include, e.g., partial exchangeability models, Gaussian models, and causal networks. Keywords: Online compression model, Repetitive structure, One-off structure, Exchangeability model, Partial exchangeability model, Gaussian model, Gauss linear model, Multivariate Gaussian model
Chapter
In the paper, the deep learning model based on deep neural networks for predicting the ROPO behaviour of tourists is compared with classical discriminant analysis techniques such as linear discriminant analysis, kernel discriminant analysis, the KNN method, SVM and classification trees on a real dataset containing the results of a survey prepared by the authors. In the second part of the paper, methods of deep neural network tuning will be used on the same dataset and their effectiveness in improving the quality of the model will be assessed. The achieved results will show that there are some situations when deep learning gives better results than classical methods of discriminant analysis for economic datasets (like the dataset describing the ROPO behaviour in this study), but it requires additional research on why certain combinations of net architecture, loss function, optimizers and selected parameters behave better than others. Keywords: ROPO, Deep learning, Deep neural networks, Discriminant analysis
Chapter
Eugen Slutsky is well-known to any graduate student in economics for two landmark articles and two operational concepts bearing his name, one in the field of consumer and utility theory (“the Slutsky equation”), the other in the field of the theory of cycles, introducing autonomous and exogenous causes in the analysis of macroeconomic fluctuations (“the Slutsky–Yule effect”). Because of the historical and political circumstances he had to cope with in Ukraine and then in Russia and in the USSR, Slutsky was prevented from devoting himself fully to mathematical economics. He only published a handful more of articles dealing with economics. Over the last twenty years, researchers in Europe, Ukraine, and Russia have been involved in making his contributions to mathematics and economics better known. By now, we get a clearer picture of Slutsky’s views on economics, and we know his network of connections with Western scholars who contributed to draw attention to his work. This essay highlights Slutsky’s lasting importance in economics, focusing on the fate of his major and lesser-known works.
Chapter
Bivariate analysis is used to obtain a better understanding of the relationship between two variables x and y, such as the length and width of a fossil, the sodium and potassium content of volcanic glass, or the organic matter content along a sediment core. When the two variables are measured on the same object, x is usually identified as the independent variable, and y as the dependent variable. If both variables have been generated in an experiment, the variable manipulated by the experimenter is described as the independent variable.
Chapter
The generalized maximum entropy model and minimum divergence estimation are examined in a framework of the regression paradigm, which is one of the most typical applications in supervised learning. This chapter begins with linear regression analysis, in which the theory for the least squares estimator (LSE) was established in the nineteenth century. Under the normal distribution model, the maximum likelihood estimator (MLE) is equal to the LSE, for which the Pythagorean theorem holds via the Kullback-Leibler (KL) divergence in an elementary manner. This property is generalized: under the t-distribution model, the γ-power estimator is equal to the LSE with the power γ adjusted to the degrees of freedom of the t-distribution. Similarly, the Pythagorean theorem holds for the γ-power divergence. Next, we consider the applications of the φ-path using the Kolmogorov-Nagumo mean. A quasi-linear regression model in a regression setting is introduced. Finally, we discuss a regression approach on the space of positive-definite matrices in the context of manifold learning, in which a problem of human color perception is challenged.
Article
Full-text available
This paper proposes a novel user-preference-aware power scheduling scheme for application at an electric vehicle (EV) charging facility. Here, the preference of the EV user is characterized as a utility function by considering two different factors: 1) a satisfaction factor according to the charged energy, and 2) payment for the received charging service. As a key component of the power scheduling method, this paper proposes a two-stage power charging method and analyzes the advantage of the proposed method from a monetary perspective. In addition, this paper analyzes the economic benefits of a charging facility and EVs using the single-leader, multi-follower Stackelberg game. By showing the existence of a unique best response for each participant, this paper presents the improvement in financial profit of EV users and the charging facility through comparison with the case where the charging facility does not apply a proper power management scheme. Based on real-world datasets, this paper shows that the proposed power scheduling scheme is feasible for an actual environment. While satisfying the constraints, it is possible to reduce the overall electricity cost by up to 8.59% compared to the case that does not consider the peak power in the EV charging facility.
Article
Full-text available
Abstract: In this article we present a proposal of progressive levels, from informal to formal, of inferential reasoning for the Student's t statistic, based on epistemic criteria identified through a historical-epistemological study of this statistic and on the research carried out on inferential reasoning. To this end, we use some theoretical-methodological notions introduced by the Onto-Semiotic Approach to mathematical knowledge and instruction (EOS), which allowed us both to identify and characterize the various meanings conferred on the Student's t statistic throughout its evolution and development, and to present an integral perspective of what is considered inferential reasoning. The mathematical attributes of the various meanings of the Student's t statistic are strongly linked to the indicators of the different levels of reasoning presented here. In addition, each level is associated with informal, pre-formal, or formal inferential reasoning. The proposed levels of inferential reasoning for the Student's t statistic and their indicators are expected to be useful for designing activities that gradually promote formal inferential reasoning about this statistic on the basis of informal inferential reasoning.
Article
Professor A.L. Nagar was a world-renowned econometrician and an international authority on finite sample econometrics with many path-breaking papers on the statistical properties of econometric estimators and test statistics. His contributions to applied econometrics have been also widely recognized. Nagar’s 1959 Econometrica paper on the so-called k-class estimators, together with a later one in 1962 on the double-k-class estimators, provided a very general framework of bias and mean squared error approximations for a large class of estimators and had motivated researchers to study a wide variety of issues such as many and weak instruments for many decades to follow. This paper reviews Nagar’s seminal contributions to analytical finite sample econometrics by providing historical backgrounds, discussing extensions and generalization of Nagar’s approach, and suggesting future directions of this literature.
Chapter
We give a statistical introduction to Information Geometry, focusing on the Pythagoras theorem in a space of probability density or mass functions when the squared length is defined by the Kullback–Leibler divergence. It is reviewed that the Pythagoras theorem extends to a foliation structure of the subspace associated with the maximum likelihood estimator (MLE) under the assumption of an exponential model. We discuss such a perspective in a framework of regression models. A simple example of the Pythagoras theorem is the Gauss least squares in a linear regression model, in which the assumption of a normal distribution is strongly connected with the MLE. On the other hand, we consider another estimator called the minimum power estimator. We extend the couple of the normal distribution model and the MLE to another couple of the t-distribution model and the minimum power estimator, which exactly associates with the dualistic structure if the power defining the estimator is matched to the degrees of freedom of the t-distribution. These observations can be applied in a framework of generalized linear models. Under an exponential-dispersion model including the Bernoulli, Poisson, and exponential distributions, the MLE leads to the Pythagoras foliation, which reveals the decomposition of the deviance statistics. This is parallel to the discussion of the residual sum of squares in the Pythagoras theorem.
Article
The origin–destination (OD) demand matrix plays an essential role in travel modeling and transport planning. Traditional OD matrices are estimated from expensive and laborious traffic counts and surveys. Accordingly, this study proposes a new combined methodology to estimate or update OD matrices (urban mobility) directly from easy-to-obtain and free-of-charge socioeconomic variables. The Málaga region, Spain, was used as a case study. The proposed methodology involves two stages. First, an automatic feature selection procedure was developed to determine the most relevant socioeconomic variables, discarding the irrelevant ones. Several feature selection techniques were studied and combined. Second, machine learning (ML) models were used to estimate mobility between predefined zones. Artificial neural networks (ANNs) and support vector regression (SVR) were tested and compared using the most relevant variables as inputs. The experimental results show that the proposed combined model can be more accurate than traditional methods and ML models without the feature selection procedure. In particular, SVR with feature selection slightly outperformed the combined model using ANNs. The proposed methodology can be a promising and affordable alternative method for estimating OD matrices, reducing costs and lead time significantly, and assisting and improving urban transport planning.
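To illustrate the kind of pipeline the authors describe (this sketch is ours; the data and variable names are hypothetical stand-ins, not the Málaga dataset), feature selection can be chained with support vector regression in scikit-learn as follows.

```python
# Minimal sketch of a feature-selection + SVR pipeline for OD-flow estimation.
# Synthetic data: rows are zone pairs, columns are candidate socioeconomic
# indicators, and the target stands in for the observed trips between two zones.
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))                                     # 20 candidate variables
y = 3 * X[:, 0] - 2 * X[:, 3] + rng.normal(scale=0.5, size=500)    # synthetic OD flows

model = Pipeline([
    ("scale", StandardScaler()),
    ("select", SelectKBest(score_func=f_regression, k=5)),  # keep the 5 most relevant variables
    ("svr", SVR(kernel="rbf", C=10.0, epsilon=0.1)),
])

scores = cross_val_score(model, X, y, cv=5, scoring="r2")
print("cross-validated R^2:", scores.mean())
```

In practice the feature-selection step would combine several techniques, as the abstract notes, rather than a single univariate filter.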
Chapter
This chapter explores how Fisher, and then Neyman and Pearson, tried to build a theory of statistical inference from the frequency definition of probability. Fisher, despite his rejection of Laplacean inverse probability, remained with the epistemic tradition, and the fundamental tension in his theory was never resolved. Neyman and Pearson developed a more consistent theory, but Neyman’s commitment to the frequency theory led him inexorably to a theory of inductive behavior, to a theory of statistical decision-making. This approach found paradigmatic application to problems of quality control in manufacturing but had little relevance for basic research in psychology.
Book
Tension has long existed in the social sciences between quantitative and qualitative approaches on one hand, and theory-minded and empirical techniques on the other. The latter divide has grown sharper in the wake of new behavioural and experimental perspectives which draw on both sides of these modelling schemes. This book works to address this disconnect by establishing a framework for methodological unification: empirical implications of theoretical models (EITM). This framework connects behavioural and applied statistical concepts, develops analogues of these concepts, and links and evaluates these analogues. The authors offer detailed explanations of how these concepts may be framed, to assist researchers interested in incorporating EITM into their own research. They go on to demonstrate how EITM may be put into practice for a range of disciplines within the social sciences, including voting, party identification, social interaction, learning, conflict and cooperation to macro-policy formulation.
Chapter
Based on experimental studies of force patterns and of the formation of the surface-layer microgeometry, a comparative analysis of various methods for constructing multifactor (in general, nonlinear) regression models of machining processes is carried out. Estimates of the modeling error, in particular the relative error and the root-mean-square error of the model, are determined for various methods of normalizing the initial data. For each multifactor model, a probability value is computed from the F-statistic and used as a threshold for model adequacy; the same value is used as the confidence level for assessing the significance of the factors under consideration. It is shown that nonlinear dependencies can be obtained only after preliminary processing of the results of the statistical tests, and that the smallest relative error and the highest modeling reliability are achieved when the initial data are first normalized in accordance with the rules of the "Italian cube".
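As a generic illustration of fitting a multifactor nonlinear (power-law) model of a machining response after a preliminary data transformation (our sketch, with synthetic data; it does not reproduce the chapter's "Italian cube" normalization), one can log-linearize and fit by ordinary least squares, then read off the F-statistic and the relative error.

```python
# Sketch: fit a multiplicative cutting-force model F = C * v**a * f**b * d**c
# by log-linearization, then report the model F-statistic p-value and the
# mean relative error.  Data are synthetic placeholders for machining records.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
v = rng.uniform(50, 300, 80)      # cutting speed
f = rng.uniform(0.05, 0.5, 80)    # feed
d = rng.uniform(0.5, 3.0, 80)     # depth of cut
F = 2000 * v**-0.15 * f**0.75 * d**0.9 * rng.lognormal(sigma=0.05, size=80)

X = sm.add_constant(np.column_stack([np.log(v), np.log(f), np.log(d)]))
res = sm.OLS(np.log(F), X).fit()

F_pred = np.exp(res.fittedvalues)
rel_err = np.mean(np.abs(F_pred - F) / F)
print(res.params)                                  # log C, a, b, c
print("model F-statistic p-value:", res.f_pvalue)  # adequacy threshold check
print("mean relative error:", rel_err)
```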
Chapter
This chapter first introduces correlation coefficients (Sect. 4.2), and then explains the widely used methods of linear and nonlinear regression analysis (Sects. 4.3, 4.9–4.11). A selection of other methods used to assess the uncertainties in regression analysis is also explained (Sects. 4.4–4.8). All methods are illustrated by means of synthetic examples, since these provide an excellent means of assessing the final outcome.
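In the same spirit as the chapter's synthetic examples (this sketch is ours, not taken from the chapter), the following computes a correlation coefficient, a linear fit, and a bootstrap confidence interval for the slope as one simple way to quantify regression uncertainty.

```python
# Sketch: Pearson correlation, ordinary least-squares fit, and a bootstrap
# confidence interval for the slope, on a synthetic dataset.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
x = np.linspace(0, 10, 60)
y = 1.5 * x + 2.0 + rng.normal(scale=2.0, size=x.size)

r, p = stats.pearsonr(x, y)
fit = stats.linregress(x, y)

boot_slopes = []
for _ in range(2000):
    idx = rng.integers(0, x.size, x.size)        # resample (x, y) pairs with replacement
    boot_slopes.append(stats.linregress(x[idx], y[idx]).slope)
lo, hi = np.percentile(boot_slopes, [2.5, 97.5])

print(f"r = {r:.3f} (p = {p:.2g}), slope = {fit.slope:.3f}, "
      f"95% bootstrap CI = [{lo:.3f}, {hi:.3f}]")
```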
Article
We propose the cyclic permutation test for testing general linear hypotheses in linear models. The test is nonrandomized and valid in finite samples, with exact Type I error α for an arbitrary fixed design matrix and arbitrary exchangeable errors, whenever 1/α is an integer and n/p ≥ 1/α − 1. The test applies the marginal rank test to 1/α linear statistics of the outcome vector, where the coefficient vectors are determined by solving a linear system such that the joint distribution of the linear statistics is invariant to a nonstandard cyclic permutation group under the null hypothesis. The power can be further enhanced by solving a secondary nonlinear travelling-salesman problem, for which a genetic algorithm can find a reasonably good solution. We show through extensive simulation studies that the cyclic permutation test has power comparable with that of existing tests. When testing a single contrast of coefficients, an exact confidence interval can be obtained by inverting the test. Furthermore, we provide a selective yet extensive literature review of the century-long efforts on this problem, highlighting the novelty of our test.
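The cyclic construction itself is intricate. As a simpler point of comparison (our sketch, and explicitly not the authors' exact finite-sample procedure), a standard approximate permutation test for a single coefficient in a linear model, in the Freedman–Lane style, can be written as follows.

```python
# Sketch: Freedman-Lane style permutation test for one coefficient in a linear
# model.  This is a standard APPROXIMATE procedure, not the exact cyclic
# permutation test described in the abstract.
import numpy as np

def permutation_test_coef(X, y, j, n_perm=5000, seed=0):
    """Approximate permutation p-value for H0: beta_j = 0 in y = X beta + eps."""
    rng = np.random.default_rng(seed)
    X_red = np.delete(X, j, axis=1)                      # reduced design without column j
    beta_red, *_ = np.linalg.lstsq(X_red, y, rcond=None)
    fitted, resid = X_red @ beta_red, y - X_red @ beta_red

    def t_stat(y_work):
        beta, *_ = np.linalg.lstsq(X, y_work, rcond=None)
        res = y_work - X @ beta
        dof = X.shape[0] - X.shape[1]
        cov = np.linalg.inv(X.T @ X) * (res @ res / dof)
        return beta[j] / np.sqrt(cov[j, j])

    t_obs = t_stat(y)
    t_perm = np.array([t_stat(fitted + rng.permutation(resid)) for _ in range(n_perm)])
    return np.mean(np.abs(t_perm) >= np.abs(t_obs))      # two-sided p-value estimate

# Example on synthetic data: the third column has a true coefficient of zero.
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(100), rng.normal(size=(100, 2))])
y = 1.0 + 0.5 * X[:, 1] + rng.standard_normal(100)
print("permutation p-value for beta_2:", permutation_test_coef(X, y, j=2))
```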
Preprint
Today many vision-science presentations employ machine learning, especially the version called “deep learning”. Many neuroscientists use machine learning to decode neural responses. Many perception scientists try to understand how living organisms recognize objects. To them, deep neural networks offer benchmark accuracies for recognition of learned stimuli. Originally machine learning was inspired by the brain. Today, machine learning is used as a statistical tool to decode brain activity. Tomorrow, deep neural networks might become our best model of brain function. This brief overview of the use of machine learning in biological vision touches on its strengths, weaknesses, milestones, controversies, and current directions. Here, we hope to help vision scientists assess what role machine learning should play in their research.
Article
Full-text available
In the multiple regression Y ~ β0 + β1X1 + β2X2 + β3X1X2 + ε, the interaction term is quantified as the product of X1 and X2. We developed fractional-power interaction regression (FPIR), which uses β3X1^M X2^N as the interaction term. The rationale of FPIR is that the slopes of the Y–X1 regression along the X2 gradient are modeled by the nonlinear function Slope = β1 + β3·M·X1^(M−1)·X2^N, instead of the linear function Slope = β1 + β3X2 that regular regressions implicitly assume. The ranges of M and N are from −56 to 56, with 550 candidate values each. We applied FPIR to a well-studied dataset, nest sites of the crested ibis (Nipponia nippon), and further tested it on another 4692 regression models. FPIRs have lower AIC values (−302 ± 5003.5) than regular regressions (−168.4 ± 4561.6), and the effect size of the AIC difference between FPIR and regular regression is 0.07 (95% CI: 0.04–0.10). We also compared FPIR with more complex models such as polynomial regression, generalized additive models, and random forests. FPIR is flexible and interpretable, using a minimum number of degrees of freedom to maximize the variance explained. We provide a new R package, interactionFPIR, to estimate the values of M and N, and we suggest using FPIR whenever the interaction term is likely to be significant.
• Introduced fractional-power interaction regression (FPIR), Y ~ β0 + β1X1 + β2X2 + β3X1^M X2^N + ε, to replace the standard model Y ~ β0 + β1X1 + β2X2 + β3X1X2 + ε;
• Clarified the rationale of FPIR and compared it with the regular regression model, polynomial regression, generalized additive models, and random forests, using regression models for 4692 species;
• Provided an R package, interactionFPIR, to calculate the values of M and N and other model parameters.
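The authors supply the interactionFPIR R package for this. Purely to illustrate the idea of searching over the interaction exponents (our sketch, not the package's implementation; it uses a small illustrative grid and assumes strictly positive predictors so that fractional powers are well defined), an AIC-minimizing search could look like this.

```python
# Sketch of an FPIR-style grid search: choose exponents (M, N) for the
# interaction term X1**M * X2**N by minimizing the AIC of an OLS fit.
# Not the interactionFPIR R package; predictors are assumed strictly positive.
import numpy as np
import statsmodels.api as sm
from itertools import product

rng = np.random.default_rng(0)
x1 = rng.uniform(1, 10, 200)
x2 = rng.uniform(1, 10, 200)
y = 1 + 0.5 * x1 + 0.3 * x2 + 0.2 * x1**0.5 * x2**1.5 + rng.normal(scale=0.5, size=200)

candidates = np.arange(-3, 3.25, 0.25)     # small illustrative grid of exponents
best = (np.inf, None, None)
for M, N in product(candidates, candidates):
    X = sm.add_constant(np.column_stack([x1, x2, x1**M * x2**N]))
    aic = sm.OLS(y, X).fit().aic
    if aic < best[0]:
        best = (aic, M, N)

print("best AIC %.1f at M=%.2f, N=%.2f" % best)
```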
Article
Full-text available
This research examined the relationship between blood pressure and some demographic factors, viz. age, sex, and status (alive or dead), of a group of patients. Ordinal logistic regression was employed for the analysis of the data. Records of 1,966 patients (men and women) were obtained from Ekiti State Teaching Hospital, Ado Ekiti. The results showed that, of the three factors, only age is statistically significant. The pseudo R-squared values (Cox and Snell, Nagelkerke, and McFadden) were very low (0.005, 0.006, and 0.002 respectively), suggesting that these predictors account for only a small proportion of the variation in patients' blood pressure. Logit models were obtained to calculate the probabilities of the various possible outcomes.
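As a generic illustration of the type of model used (our sketch; the column names and data below are synthetic stand-ins, not the Ekiti State Teaching Hospital records), an ordinal (proportional-odds) logit can be fitted with statsmodels as follows.

```python
# Sketch: ordinal (proportional-odds) logistic regression of a blood-pressure
# category on age and sex, using synthetic data in place of hospital records.
import numpy as np
import pandas as pd
from statsmodels.miscmodels.ordinal_model import OrderedModel

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "age": rng.integers(20, 90, n),
    "sex": rng.integers(0, 2, n),      # 0 = female, 1 = male (hypothetical coding)
})
latent = 0.04 * df["age"] + 0.1 * df["sex"] + rng.logistic(size=n)
df["bp_cat"] = pd.cut(latent, bins=[-np.inf, 2.0, 3.5, np.inf],
                      labels=["normal", "elevated", "high"], ordered=True)

model = OrderedModel(df["bp_cat"], df[["age", "sex"]], distr="logit")
res = model.fit(method="bfgs", disp=False)
print(res.summary())

# Predicted probabilities of each blood-pressure category for a 60-year-old male
print(res.predict(pd.DataFrame({"age": [60], "sex": [1]})))
```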