Microsoft Excel Data Analysis - Science topic

Questions related to Microsoft Excel Data Analysis
  • asked a question related to Microsoft Excel Data Analysis
Question
3 answers
I am studying the impact of leadership style on job satisfaction. In the data collection instrument, there are 13 questions on leadership style, divided among a couple of leadership styles, but only four questions on job satisfaction. How do I run correlational tests on these variables? Which values do I select to analyze in Excel?
Relevant answer
Answer
First, compute the correlation between your target variable and each of your potential independent variables, and check which independent variables are most strongly correlated with the target (as mentioned earlier, a correlation coefficient closest to -1 or +1). Once you have decided from these coefficients which variables to select for your model, you need to ensure that there will be no multicollinearity in the model. To do that, run correlation tests again, this time between each pair of independent variables. If two independent variables are too strongly correlated, include only one of them in your model (e.g. the variable that had the higher correlation with your dependent variable).
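As a minimal sketch of the first step, assuming each respondent's items are averaged into one score per leadership style and one job-satisfaction score (the data and the column layout below are made up for illustration), the correlation could be computed like this. In Excel the same two steps would roughly correspond to AVERAGE across the item columns followed by CORREL.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def composite(rows, columns):
    """Average the chosen item columns for each respondent."""
    return [sum(row[c] for c in columns) / len(columns) for row in rows]

# Hypothetical responses: one row per respondent. Columns 0-2 are items of
# one leadership style, columns 3-4 are job-satisfaction items.
responses = [
    [4, 5, 4, 5, 4],
    [2, 1, 2, 2, 1],
    [3, 3, 4, 3, 3],
    [5, 4, 5, 4, 5],
    [1, 2, 1, 2, 2],
]

style_score = composite(responses, [0, 1, 2])
satisfaction_score = composite(responses, [3, 4])
r = pearson_r(style_score, satisfaction_score)
print(round(r, 3))
```

Repeating the last three lines for each leadership style gives one coefficient per style, which can then be compared as the answer describes.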
  • asked a question related to Microsoft Excel Data Analysis
Question
8 answers
I have a dataset of patients with ESRD and want to estimate GFR using the 2021 CKD-EPI formula.
Relevant answer
Answer
Hello Dineo
Here is the code (Stata) to calculate eGFR according to the CKD-EPI 2021 equation:
gen eGFR01 = .
replace eGFR01 = 142 * (PreopCreatinine/0.9)^(-1.2) * 0.9938^Ageatdx if SexM1==1 & PreopCreatinine > 0.9
replace eGFR01 = 142 * (PreopCreatinine/0.9)^(-0.302) * 0.9938^Ageatdx if SexM1==1 & PreopCreatinine <= 0.9
replace eGFR01 = 142 * (PreopCreatinine/0.7)^(-1.2) * 0.9938^Ageatdx * 1.012 if SexM1==0 & PreopCreatinine > 0.7
replace eGFR01 = 142 * (PreopCreatinine/0.7)^(-0.241) * 0.9938^Ageatdx * 1.012 if SexM1==0 & PreopCreatinine <= 0.7
best regards.
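For readers not working in Stata, the same four branches can be sketched in Python. This is a sketch under the assumption that creatinine is in mg/dL and age in years; the function and argument names are illustrative, not part of the original answer.

```python
def egfr_ckd_epi_2021(creatinine_mg_dl, age_years, is_male):
    """CKD-EPI 2021 creatinine equation (race-free), mirroring the code above."""
    kappa = 0.9 if is_male else 0.7          # sex-specific creatinine threshold
    alpha = -0.302 if is_male else -0.241    # exponent used at or below the threshold
    ratio = creatinine_mg_dl / kappa
    egfr = (142
            * (min(ratio, 1.0) ** alpha)
            * (max(ratio, 1.0) ** -1.2)
            * (0.9938 ** age_years))
    if not is_male:
        egfr *= 1.012                        # multiplier applied for females
    return egfr

# Hypothetical patient: male, 50 years old, creatinine 1.8 mg/dL
print(round(egfr_ckd_epi_2021(1.8, 50, True), 1))
```

The min/max form is the compact way of writing the "above threshold" and "at or below threshold" cases that the Stata code spells out with separate replace statements.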
  • asked a question related to Microsoft Excel Data Analysis
Question
3 answers
How can I transform my results from the EuroQol EQ-5D + EQ VAS into an overall score that can be presented as "good, moderate, poor" quality of life?
We will use it in pre- and post-bariatric-surgery research, and it is not the main topic, so we would like a short questionnaire that is easy to use, easy to calculate and easy to present.
Relevant answer
Answer
The EQ-5D has a prescribed structure that we cannot manipulate; no one will accept a modified version. In the UK, some health economics experts (Professor Ric Fordham is one of them) are discussing whether the EQ-5D needs to change, since the method may need to be modified over time as diseases transition.
As the answer above says, use the "Visual Analog Scale" if you want to report QALYs.
Additionally, you could think about a
Likert scale:
This is a question that uses a 5- or 7-point scale. Typically, a Likert survey question includes a moderate or neutral option in its scale.
  • asked a question related to Microsoft Excel Data Analysis
Question
3 answers
Hey,
I want to calculate the standard deviation for each substituent for two molecules using Excel and then calculate the average of all the values (no S.D).
For S.D I used STDEV.P and for average I used AVERAGE. Is it the right way? Or should I use STDEV.S? or should I calculate range (large-small) instead of average?
Relevant answer
Answer
"R squared is higher" -- compared to what?
"STDEV isn't lower" -- compared to what?
The STDEV of the (absolute) differences between x- and y-values correlates with the correlation between x and y: the higher the correlation between x and y, the smaller the STDEV of their absolute differences should be. But in your case I don't see with what you are comparing your results - you have only a single sample.
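On the STDEV.P versus STDEV.S part of the question, a small sketch may help: the two differ only in the divisor (n for the population form, n - 1 for the sample form), so the sample value is always slightly larger. The numbers below are made up; `statistics.pstdev` and `statistics.stdev` play the roles of STDEV.P and STDEV.S.

```python
import statistics

# Hypothetical measurements for one substituent
values = [2.1, 2.4, 1.9, 2.6]

population_sd = statistics.pstdev(values)  # divides by n, like STDEV.P
sample_sd = statistics.stdev(values)       # divides by n - 1, like STDEV.S
mean_value = statistics.mean(values)       # like AVERAGE

print(round(population_sd, 4), round(sample_sd, 4), round(mean_value, 3))
```

If the values are a sample from which a wider population is to be described, the n - 1 form is the usual choice; the population form fits only when the listed values are the entire population of interest.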
  • asked a question related to Microsoft Excel Data Analysis
Question
8 answers
I tried using Gigasheet, but it lacks many features that are available in Excel. Please suggest some freely available tools into which I can load my ~1.7 million rows and do some calculations, such as sorting by multiple columns and removing duplicates.
TIA
Relevant answer
Answer
I'd do this in R. It's free and sufficiently powerful to handle such tables with ease.
This could be an example work-flow (assuming you have a file named "my_huge_file.txt" which is a tab-delimited text file with many rows and a header row that contains the texts "ID", "value", "name" and "amount" [just for example!] as column names):
# read the file into a data.frame:
df <- read.delim("my_huge_file.txt")
# sort the rows by values in the column "name", then "value":
df <- df[order(df$name, df$value), ]
# remove all rows with duplicate entries in the column "ID":
df <- df[!duplicated(df$ID), ]
# get the 5 rows with the largest values in the column "amount"
# (stored separately so that df keeps all rows for the next step):
o <- order(df$amount, decreasing = TRUE)[1:5]
top5 <- df[o, ]
# get the mean of the values in the column "value"
# for each value in the column "name":
tapply(df$value, INDEX = df$name, FUN = mean)
  • asked a question related to Microsoft Excel Data Analysis
Question
5 answers
I need to run an ART ANOVA and Tukey HSD for the interactions among the treatments, but my dataset has a few NAs due to experimental errors.
When I run:
anova(model<- art(X ~ Y, data = d.f))
I get the error:
Error in (function (object) :
Aligned Rank Transform cannot be performed when fixed effects have missing data (NAs).
Manually lifting is not an option because each row is a sample and it would keep NAs, simply in wrong samples.
Relevant answer
Answer
The issue is that you are using art() from ARTool to fit the model, and that can't handle missing values. You could use listwise deletion by passing na.omit(d.f) to the art() function, though this could bias the results (no more, though, than dropping incomplete rows in anova() or lm() would).
A better solution is to use multiple imputation (e.g., with the mice package in R), though I'm not sure whether that works directly with art() models, or to use a different approach to handle your data (which presumably aren't suitable for linear models). You could use a transformation, a generalized linear model, robust regression, etc., depending on the nature of the data.
  • asked a question related to Microsoft Excel Data Analysis
Question
4 answers
Dear experts,
Could you please give me a hint on how to depict ranges in a scatter plot, e.g. how to highlight a reference range for data in clinical chemistry, when plotting a time course (see attached image)?
Currently, I'm using LibreOffice Calc and add the reference range manually by drawing a rectangle, but that's tedious to copy/paste into a report when analyzing lots of data, as for each plot, I need to make a screenshot first and then cut and paste the diagram (or select both the rectangle and the diagram before copying). It also would be more convenient and accurate, if I could specify the reference range in the data table or in a form, instead of belly guessing it on the chart.
Any suggestion how I could automate this procedure (e.g. by using a dedicated function and specifying the reference range and possibly a color for it in a data column) so that I just can copy/paste the diagram or even better link to it?
Is there by chance a plug-in or a dedicated program for this purpose?
Thanks for your help!
Relevant answer
Answer
I don't know LibreOffice, only Excel. (BTW: Why do you use it instead of Microsoft Office?)
I will answer your question in terms of Excel.
See the attached workbook "Shapes in Plots.xlsx" as an example.
Your method is indeed very tedious and impractical. I suggest two better ways:
A) Manually paint a shape on the diagram.
B) Insert a calculated polygon as another point(x/y) data series into the diagram.
The disadvantage of A is that the shape does not automatically adapt to the data, but must be adjusted manually. It is simple, though.
With B, on the other hand, the polygon adapts directly to the data. However, you cannot fill it with a color or a pattern.
In my example (see sheet "Parabola") I use a parabola
y = -ax*(x - xm)^2 + ym
with given maximum (xm, ym) and form factor ax. From this, the zeros are calculated and a table with x/y data is set up.
From this a diagram with the (smoothed) parabola function and additionally the interesting points is created as usual. I want to draw a rectangle between the zeros and the maximum.
Method A is demonstrated and explained on sheet "With Shape". This results in "Shapes in Plots With Shape.png":
Method B is demonstrated and explained on sheet "With Polygon". This results in "Shapes in Plots With Polygon.png":
  • asked a question related to Microsoft Excel Data Analysis
Question
3 answers
There are a few steps to make a heatmap of your qRT-PCR data (fold change or relative quantification) using R.
Data file preparation:
Make an Excel file of your data in which the genes of interest are listed in the first column and the treatments or conditions run across the first row.
Save the file with the .csv extension.
Import data file in R:
Use the following code to import your data file into R:
data2 <- read.csv("data1.csv")
~ data1.csv is the name of the data file you created in Excel, and data2 is the name of your data in R. You can use your own names instead of data1 and data2, and you can even use the same name in both places.
When you import the data, the first column you see is made up of serial numbers. These need to be replaced with the entries of the actual first column of your data, which contains your genes of interest. To do this, use this code:
rownames(data2) <- data2$Name
~ Name is the first column
This replaces the serial numbers with your first column, but now your genes of interest appear twice. To remove the duplicate column, use this code:
data2$Name <- NULL
Now your data is ready for creating the heatmap.
Developing heatmap:
First convert your data to a matrix with the following code:
data2 <- as.matrix(data2)
Now install the "pheatmap" package, which creates the heatmap, with the following code:
install.packages("pheatmap")
After installing it once, load the package every time you want to use it with the following code:
library("pheatmap")
Then make the heatmap of your data with the following code:
pheatmap(data2)
Usually the fold change/relative quantification values are shown inside the heatmap. To add them, modify your code in the following way:
pheatmap(data2, display_numbers = TRUE)
- You can customize your heatmap in many ways. Contact me any time if you need any help.
Relevant answer
Answer
Good luck
  • asked a question related to Microsoft Excel Data Analysis
Question
6 answers
Dear All,
I have two sets of time series data that I would like to correlate. One data set is the deposits, by month, for a list of different accounts. The other is the balances, by month, for the same list of accounts. In essence, I have two matrices whose correlation I want to understand without having to strip out each account separately. Furthermore, I want to cross-section that data into different segments.
This is being done with the goal of being able to forecast account balances in the future by looking at usage behavior (assuming there is a lag relationship).
How do I build an intermediate matrix of the correlations? Is there a way to do it in Python or RStudio? Is there a way to do it in Excel?
Thanks
Ryan
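One possible way to build such an intermediate matrix in Python, sketched under the assumption that each account has two equal-length monthly series (the account names and numbers below are invented): compute, for every account, the correlation between deposits in month t and balances in month t + lag, for a few lags. The result is one row per account and one column per lag.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def lagged_correlation(deposits, balances, lag):
    """Correlate deposits in month t with balances in month t + lag (lag >= 0)."""
    if lag > 0:
        return pearson_r(deposits[:-lag], balances[lag:])
    return pearson_r(deposits, balances)

# Hypothetical monthly data; in acct_A the balance follows deposits by one month.
accounts = {
    "acct_A": {"deposits": [10, 30, 20, 50, 40, 60],
               "balances": [5, 10, 30, 20, 50, 40]},
    "acct_B": {"deposits": [12, 15, 11, 18, 14, 20],
               "balances": [40, 38, 45, 41, 47, 44]},
}

lags = [0, 1, 2]
correlation_table = {
    name: [lagged_correlation(s["deposits"], s["balances"], lag) for lag in lags]
    for name, s in accounts.items()
}
for name, row in correlation_table.items():
    print(name, [round(r, 2) for r in row])
```

Segments can then be formed by grouping the rows of this table on whatever account attribute is of interest. With real data, trends in both series can inflate such correlations, so correlating month-to-month changes instead of levels may be worth checking.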
  • asked a question related to Microsoft Excel Data Analysis
Question
12 answers
Hi everyone,
Recently, I have been working on a study in which I examine the impact of American tariffs, customs and other import duties on European exports to the US. I have three variables (y = EU exports to the US, x1 = US tariffs, x2 = US customs and other duties). I use quarterly data from 1995Q1 until 2017Q1 (89 observations). My tutor has emphasized that I need to control for year and country fixed effects and maybe introduce dummies per year and country. I am quite clueless about how to do that. Why is it necessary? What is the equation? How do I do that in Excel or EViews? I would very much appreciate step-by-step instructions!
Thank you in advance for any help or comments.
Relevant answer
Answer
Rasidah Mohd-Rashid I hope the answer is not late, I suppose you want to add year fixed effects to control for year to year variation, this is a perfectly normal situation for event studies. Ideally, to not have a lot of dummy variables and noise, there should be few effects to be controlled, so having few years of data is fine.
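To make the "dummies per year" idea concrete: with year fixed effects the regression takes the form y_t = b0 + b1*x1_t + b2*x2_t + sum over years k of g_k*D_k,t + e_t, where D_k,t is 1 if quarter t falls in year k and 0 otherwise, and one year is left out as the baseline. A minimal sketch of building those 0/1 columns follows (the quarter labels are shortened sample data and the function name is illustrative); the resulting columns would be added as extra regressors in Excel or EViews.

```python
def year_dummies(years):
    """One 0/1 column per year, leaving out the earliest year as the baseline."""
    levels = sorted(set(years))
    return {
        f"year_{level}": [1 if obs == level else 0 for obs in years]
        for level in levels[1:]
    }

# Hypothetical excerpt of the quarterly index
quarters = ["1995Q1", "1995Q2", "1995Q3", "1995Q4", "1996Q1", "1996Q2", "1997Q1"]
years = [int(q[:4]) for q in quarters]

dummies = year_dummies(years)
for name, column in dummies.items():
    print(name, column)
```

Dropping one year avoids perfect collinearity with the intercept. Country dummies would be built the same way from a country column, which only makes sense if the data are a panel with several countries rather than a single aggregate series.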
  • asked a question related to Microsoft Excel Data Analysis
Question
4 answers
I can't find a straight answer on google and it's frustrating me. I am trying to indicate that some cells in my data sheet do not have a value of 0, but rather that no data was obtained and the cell should therefore be left out of calculations. How would I achieve this?
Just typing in "No Data" is problematic, since I receive an error when using pivot tables to calculate averages.
Relevant answer
Answer
I am not sure I understand what your problem is... If you leave the cell empty, I believe you should be fine.
For example
cell1 = 30
cell2 =
cell3 = 20, etc.
  • asked a question related to Microsoft Excel Data Analysis
Question
4 answers
I performed a growth performance experiment on microalgae with four treatments, in which I measured cell dry weight (unit: mg/L), cell density (unit: ×10^5 cells/ml), chlorophyll a (unit: µg/ml) and beta carotene (unit: µg/ml) content of the microalgae. A reviewer suggests that I analyze the correlation of cell growth/size with the pigment content (chlorophyll a and beta carotene) of the microalgae. How can I measure this correlation using Microsoft Excel or other suitable data analysis software? Thank you.
Relevant answer
Answer
Thank you all
  • asked a question related to Microsoft Excel Data Analysis
Question
5 answers
Since it is preferred to check any autocorrelation among the variables; one has to remove highly correlated variables to run an SDM (I am using MaxEnt). For my study, I have calculated the Pearson correlation coefficient (r) among the variables (correlation matrix is provided). But as I am new to this, I am finding it hard to interpret the correlation matrix table. Meaning how and on what basis, I am going to remove the variables? (I am taking the threshold ≥0.8 for the purpose), I need some expert suggestions.
Q1. How are the variables chosen? Please advise me based on the provided table: which variables should I select for my study?
Q2. How is one variable selected when there is a high correlation between two variables?
Q3. Is negative correlation not a problem? I am asking because I have seen a few papers in which highly negatively correlated variables are also selected.
Please help me.
Relevant answer
Answer
You have 20 variables (bio1 to bio19 and elev), many of them highly pairwise correlated. This means that the data contain redundant information and uncertainty, which makes it difficult to attribute and interpret the contribution of one or more variables. Knowing the pairwise linear correlation coefficients does not help to reduce redundant information and extract meaningful information for separate contributing factors.
I suggest using a principal component decomposition methodology that allows one to perform a multivariate correlation analysis and identifies redundant variables that carry little or no independent information while retaining only a few mutually uncorrelated principal variables (components) that contain practically all original information. This technique is a special case of a matrix approximation procedure called singular value decomposition. The higher the level of correlation between the columns of data of the original matrix, the fewer the number of new (principal) variables is required to describe the original data set.
You can perform principal component analysis using various statistical packages, such as R, Matlab, or Minitab. The latter requires no coding at all.
An overview of principal component analysis can be found in many books on multivariate analysis.
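For comparison with the PCA route recommended above, the thresholding approach described in the question can be sketched as follows. This is only an illustration of that approach, with made-up values: variables are visited in the order given (ideally most ecologically relevant first), and a variable is kept only if its absolute correlation with every variable already kept is below the threshold. Using the absolute value treats strong negative correlation as equally redundant, which bears on Q3.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def select_uncorrelated(data, threshold=0.8):
    """Greedy filter: keep a variable if |r| < threshold against all kept ones."""
    kept = []
    for name in data:  # dict order = priority order chosen by the analyst
        if all(abs(pearson_r(data[name], data[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical values at six sites; names follow the bio/elev convention.
predictors = {
    "bio1": [1, 2, 3, 4, 5, 6],
    "bio2": [2, 4, 6, 8, 10, 12.5],   # nearly collinear with bio1
    "bio12": [6, 5, 4, 3, 2, 1],      # strongly negatively correlated with bio1
    "elev": [3, 1, 4, 1, 5, 2],
}
print(select_uncorrelated(predictors))
```

The outcome depends on the priority order, which is why the choice between two correlated variables is usually made on ecological grounds and not on the numbers alone.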
  • asked a question related to Microsoft Excel Data Analysis
Question
12 answers
I want to analyze participant data from different workshops (approx. 40 workshops), and I have the following two issues:
A. Out of 4000 participants, I want to check how many attended multiple workshops.
B. I want to separate the participants' data based on their designation, department and affiliation.
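Both tasks can be sketched in a few lines of Python, assuming the attendance lists are combined into one table with one row per (workshop, participant) pair and a reliable participant identifier. The field names and records below are invented for illustration.

```python
from collections import defaultdict

# Hypothetical combined attendance list, one record per registration
records = [
    {"workshop": "W01", "participant": "P001", "designation": "Lecturer", "department": "Physics"},
    {"workshop": "W02", "participant": "P001", "designation": "Lecturer", "department": "Physics"},
    {"workshop": "W01", "participant": "P002", "designation": "Professor", "department": "Chemistry"},
    {"workshop": "W03", "participant": "P003", "designation": "Lecturer", "department": "Physics"},
    {"workshop": "W03", "participant": "P003", "designation": "Lecturer", "department": "Physics"},
]

# A. Participants who attended more than one distinct workshop
workshops_by_participant = defaultdict(set)
for rec in records:
    workshops_by_participant[rec["participant"]].add(rec["workshop"])
repeat_attendees = sorted(p for p, w in workshops_by_participant.items() if len(w) > 1)

# B. Participants grouped by any chosen field
def group_participants(rows, field):
    groups = defaultdict(set)
    for rec in rows:
        groups[rec[field]].add(rec["participant"])
    return {key: sorted(members) for key, members in groups.items()}

by_department = group_participants(records, "department")
print(repeat_attendees, by_department)
```

Using sets means a duplicated registration for the same workshop (as for P003 here) is not counted as attending multiple workshops. In Excel, a pivot table on the combined list would give a comparable summary.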
Relevant answer
Answer
For a RG discussion on the meaning of the 'online' (or 'on-line') terminology at this forum:
  • asked a question related to Microsoft Excel Data Analysis
Question
3 answers
I used binary logistic regression to model the behavior of drivers' stop and go decision in dilemma zone. Now I have to estimate the elasticity of variables (corresponding change in outcome probability on changing one unit of X). My variables are of continuous and categorical nature. Any help would be highly appreciated.
P.S: I have uploaded the screenshot of one of the papers where author has calculated the elasticities.
Relevant answer
Answer
FGS try the videos or even z-library for some examples
Best, D Booth
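For the question above, a sketch of the usual formulas may be useful. For a binary logit with P = 1 / (1 + exp(-z)), the marginal effect of a continuous variable x is beta * P * (1 - P) and its point elasticity is beta * x * (1 - P); for a 0/1 variable a pseudo-elasticity (P at 1 minus P at 0, divided by P at 0) is commonly reported instead. The coefficients and variable names below are hypothetical, not taken from the paper mentioned.

```python
from math import exp

def probability(coefs, intercept, x):
    """Logit probability for one observation; coefs and x share the same keys."""
    z = intercept + sum(coefs[k] * x[k] for k in coefs)
    return 1.0 / (1.0 + exp(-z))

def marginal_effect(coefs, intercept, x, name):
    """Change in probability per unit change of a continuous variable."""
    p = probability(coefs, intercept, x)
    return coefs[name] * p * (1.0 - p)

def elasticity(coefs, intercept, x, name):
    """Percent change in probability per percent change of a continuous variable."""
    p = probability(coefs, intercept, x)
    return coefs[name] * x[name] * (1.0 - p)

def pseudo_elasticity(coefs, intercept, x, name):
    """Relative change in probability when a 0/1 variable switches from 0 to 1."""
    p0 = probability(coefs, intercept, {**x, name: 0})
    p1 = probability(coefs, intercept, {**x, name: 1})
    return (p1 - p0) / p0

# Hypothetical fitted model: distance to stop line (continuous), truck (0/1)
coefs = {"distance": 0.1, "truck": -0.5}
intercept = -2.0
driver = {"distance": 20.0, "truck": 0}

print(marginal_effect(coefs, intercept, driver, "distance"))
print(elasticity(coefs, intercept, driver, "distance"))
print(round(pseudo_elasticity(coefs, intercept, driver, "truck"), 3))
```

These are point values for one observation; papers often report them averaged over all observations in the sample, so it is worth checking which convention the reference paper uses.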
  • asked a question related to Microsoft Excel Data Analysis
Question
4 answers
Imports System.Data.OleDb
Imports Microsoft.Office.Interop
Imports System.Windows.Forms
Public Class UploadData
Private connString As String = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\Users\admin\Desktop\MatrixVisualizationSystem\MatrixVisualizationSystem\database.mdb;Persist Security Info=True"
Private Sub btnBrowse_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles btnBrowse.Click
Dim result As DialogResult = ofd.ShowDialog()
If result = Windows.Forms.DialogResult.OK Then
txtfilename.Text = ofd.FileName ' store the path of the selected Excel file in txtfilename
End If
End Sub
Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
If txtfilename.Text = "" Then
lblstatus.Text = "Please select the file to load the data from"
lblstatus.ForeColor = Color.Red
Return
Else
lblstatus.Text = ""
End If
Dim Excel As String = txtfilename.Text ' path of the selected Excel file
Dim excelConnString As String = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & Excel & ";Extended Properties='Excel 8.0;HDR=Yes;IMEX=1;';"
Dim excelConnection As New OleDbConnection(excelConnString)
excelConnection.Open() ' the connection must be open before ExecuteReader is called
Dim excelCommand As New OleDbCommand("select * from [ورقة1$]", excelConnection) ' ورقة1 is the Arabic sheet name ("Sheet1")
Dim excelReader As OleDbDataReader = excelCommand.ExecuteReader()
Dim dt As New DataTable() ' dt: the DataTable behind the data grid
dt.Load(excelReader) ' load the rows and columns so they can be handled as a table
Dim accessConnection As New OleDbConnection(connString)
Dim accessCommand As New OleDbCommand() ' generic command; its text is set below
For Each row As DataRow In dt.Rows ' walk over every row as a DataRow
If row(0).ToString = "" Then ' if the row is empty
Continue For ' skip it and move on to the next row
End If
accessCommand.CommandText = String.Format("INSERT INTO Schedule (courseCode, courseNo, SectionNo, CourseTitle, Hours, Days, fHour, tHour, room, roomNo, maxLimit, teacherName) " & _
" VALUES ('{0}','{1}','{2}','{3}','{4}','{5}','{6}','{7}','{8}','{9}','{10}','{11}')",
row(0),
row(1).ToString(),
row(2),
row(3),
row(4),
row(5),
row(6),
row(7),
row(8),
row(9),
row(10),
row(11)) ' store all the data in Access
accessCommand.Connection = accessConnection
If accessConnection.State = ConnectionState.Closed Then ' open the connection if it is closed; otherwise just insert
accessConnection.Open()
End If
accessCommand.ExecuteNonQuery()
Next
accessCommand.CommandText = "select * from Schedule" ' reload the stored rows (empty rows were skipped above)
dt = New DataTable
dt.Load(accessCommand.ExecuteReader)
gridSchedule.DataSource = dt
If excelConnection.State = ConnectionState.Open Then
excelConnection.Close()
End If
If accessConnection.State = ConnectionState.Open Then
accessConnection.Close()
End If
End Sub
'Private Sub Button2_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button2.Click
' ' the professor's code
' ' The problem: the hours and the course name do not come back as strings
' Dim xlsApplicaton As Excel.Application
' Dim xlsWorkBook As Excel.Workbook
' Dim xlsWorkSheet As Excel.Worksheet
' xlsApplicaton = New Excel.Application
' xlsWorkBook = xlsApplicaton.Workbooks.Open("G:\123.xls")
' xlsWorkSheet = xlsWorkBook.Worksheets("Sheet1")
' Dim type As Object = xlsWorkSheet.Cells(1, 7).Value.GetType ' 1 row , 7 col
' For i As Integer = 2 To xlsWorkSheet.Rows.Count
' Dim CourseCode As Object = xlsWorkSheet.Cells(i, 1).Value
' Dim CourseNo As Object = xlsWorkSheet.Cells(i, 2).Value
' Dim SectionNo As Object = xlsWorkSheet.Cells(i, 3).Value
' Dim CourseTitle As Object = xlsWorkSheet.Cells(i, 4).Value
' Dim hours As Object = xlsWorkSheet.Cells(i, 5).Value
' Dim Days As Object = xlsWorkSheet.Cells(i, 6).Value
' Dim fHour As Object = xlsWorkSheet.Cells(i, 7).Value
' Dim tHour As Object = xlsWorkSheet.Cells(i, 8).Value
' Dim room As Object = xlsWorkSheet.Cells(i, 9).Value
' Dim roomNo As Object = xlsWorkSheet.Cells(i, 10).Value
' Dim teacherName As Object = xlsWorkSheet.Cells(i, 12).Value
' Next
'End Sub
Private Sub ofd_FileOk(ByVal sender As System.Object, ByVal e As System.ComponentModel.CancelEventArgs) Handles ofd.FileOk
End Sub
End Class
Relevant answer
Answer
My admittedly rough translation to English is for a line of code towards the bottom, within the following lines:
'Private Sub Button2_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button2.Click
' ' شغل الدكتور
' ' المشكلة الساعات واسم المساق ما بيعطي الي فيه سترينغ
' Dim xlsApplicaton As Excel.Application
' ' المشكلة الساعات واسم المساق ما بيعطي الي فيه سترينغ
In English this could read as…
The problem is the hours and the name of the course. It does not give me a string
  • asked a question related to Microsoft Excel Data Analysis
Question
9 answers
I would like to calculate the MSD and Deff for each particle (every particle has a unique TRACK_ID in the table).
You could find one of my files attached. 
Thanks a lot
Relevant answer
Answer
Hi Naouel Athmane, did you find out how the track manager plugin in the Icy software calculates the MSD plot?
Or does anyone else know?
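Independently of how any particular plugin does it, a common way to compute a time-averaged MSD per track can be sketched as follows. The assumptions here are mine and not taken from the attached file: 2D positions, a constant frame interval dt, no missing frames, and Deff taken as the slope of MSD = 4 * D * t fitted through the origin over the first few lags. The column order and values are invented; only TRACK_ID comes from the question.

```python
from collections import defaultdict

def msd_by_lag(positions, max_lag):
    """Time-averaged mean squared displacement for lags 1..max_lag.

    positions is a list of (x, y) tuples ordered by frame.
    """
    result = []
    for lag in range(1, max_lag + 1):
        squared = [
            (positions[i + lag][0] - positions[i][0]) ** 2
            + (positions[i + lag][1] - positions[i][1]) ** 2
            for i in range(len(positions) - lag)
        ]
        result.append(sum(squared) / len(squared))
    return result

def effective_diffusion(msd, dt):
    """Slope of a least-squares line through the origin, divided by 4 (2D)."""
    times = [dt * (k + 1) for k in range(len(msd))]
    slope = sum(t * m for t, m in zip(times, msd)) / sum(t * t for t in times)
    return slope / 4.0

# Hypothetical rows: (TRACK_ID, frame, x, y)
rows = [
    (0, 0, 0.0, 0.0), (0, 1, 1.0, 0.0), (0, 2, 1.0, 1.0), (0, 3, 2.0, 1.0),
    (1, 0, 5.0, 5.0), (1, 1, 5.0, 6.0), (1, 2, 5.0, 7.0), (1, 3, 5.0, 8.0),
]

tracks = defaultdict(list)
for track_id, frame, x, y in sorted(rows):  # sorts by track, then frame
    tracks[track_id].append((x, y))

results = {}
for track_id, positions in tracks.items():
    msd = msd_by_lag(positions, max_lag=2)
    results[track_id] = (msd, effective_diffusion(msd, dt=0.1))
print(results)
```

Only the first few lags are normally used for the fit, because the averages at long lags rest on very few displacement pairs.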
  • asked a question related to Microsoft Excel Data Analysis
Question
3 answers
I have 2 sets of data; one has around 8700 values and the other around 5100 values. They come from different setups, and I need to find the error between them. I have made the graphs, and I need to find the exact value of the maximum error in the trend. The data is currently in MS Excel. I know MATLAB and Python, but I am unable to get the value. Is there any way it can be done?
To explain it further in terms of a graph, the x-axis contains time and the y-axis contains the observed data. I need the time at which the maximum error occurs and the value of that error.
Relevant answer
Hi, Aakash Gupta!
You can compare the chunk in which the number and the names of the parameters match. Try using principal component analysis (PCA). If you send me the dataset, I'll try to find the time interval of the error and show it on the graph. I work in Python.
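A simpler alternative to PCA, sketched here under the assumption that both setups recorded a time stamp with each value: interpolate one series onto the time stamps of the other, take the absolute difference, and report the time and size of the largest difference. The data below are invented and the function names are illustrative.

```python
from bisect import bisect_left

def interpolate(times, values, t):
    """Linear interpolation at time t; times must be sorted and contain t's range."""
    i = bisect_left(times, t)
    if times[i] == t:
        return values[i]
    t0, t1 = times[i - 1], times[i]
    v0, v1 = values[i - 1], values[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)

def max_error(times_a, values_a, times_b, values_b):
    """Return (time, error) where |A - B| is largest, evaluated at A's time stamps."""
    best_time, best_error = None, -1.0
    for t, a in zip(times_a, values_a):
        if times_b[0] <= t <= times_b[-1]:  # stay inside B's time range
            error = abs(a - interpolate(times_b, values_b, t))
            if error > best_error:
                best_time, best_error = t, error
    return best_time, best_error

# Hypothetical series sampled at different rates
times_a, values_a = [0, 1, 2, 3, 4], [0, 1, 4, 9, 16]
times_b, values_b = [0, 2, 4], [0, 4, 16]

print(max_error(times_a, values_a, times_b, values_b))
```

When several time stamps tie for the largest error, this version reports the earliest one.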
  • asked a question related to Microsoft Excel Data Analysis
Question
3 answers
I have 50 survey responses, coded as 0 and 1 in a square matrix format, pasted on different Excel worksheets. I want to measure the frequency of 0 and 1 in each cell to identify whether 0 or 1 is more frequent. What is a suitable method for this?
Relevant answer
Answer
Use the =Countif function to count the number of times each unique entry appears in the original list.
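The same per-cell count can be sketched in Python, assuming each response has been read into a square list of lists of 0s and 1s (three tiny made-up responses stand in for the 50 worksheets). Because the coding is 0/1, the sum across responses in a cell equals the number of ones.

```python
def cell_frequencies(matrices):
    """Count zeros and ones in each cell position across all response matrices."""
    size = len(matrices[0])
    ones = [
        [sum(m[i][j] for m in matrices) for j in range(size)]
        for i in range(size)
    ]
    zeros = [
        [len(matrices) - ones[i][j] for j in range(size)]
        for i in range(size)
    ]
    return zeros, ones

# Hypothetical responses (2 x 2 for brevity)
responses = [
    [[0, 1], [1, 1]],
    [[0, 1], [0, 1]],
    [[1, 1], [0, 0]],
]

zeros, ones = cell_frequencies(responses)
majority = [
    [1 if ones[i][j] > zeros[i][j] else 0 for j in range(2)]
    for i in range(2)
]
print(zeros, ones, majority)
```

With an even number of responses a cell can tie; this sketch assigns ties to 0, which may or may not be the desired rule.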
  • asked a question related to Microsoft Excel Data Analysis
Question
7 answers
To analyze the daily yield of solar PV, the master file needs to be split into smaller files. Because Excel limits the number of rows, it is hard to load the file or extract data from it. I hope a good solution or workaround is available.
Thank you in advance.
Relevant answer
Answer
There are some free online services for splitting large files, so why not try one of those? I also wrote a small Python recipe to split a text file of any size into multiple chunks:
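The recipe itself is not reproduced in this thread. A minimal sketch of the same idea (not the author's code) could look like the following: it streams the file line by line, so size is not a concern, and repeats the header row in every chunk. File names and the demo data are hypothetical.

```python
import os
import tempfile

def split_file(path, lines_per_chunk, out_dir):
    """Split a text file into numbered chunks, repeating the header in each."""
    chunk_paths = []
    out = None
    with open(path, "r", encoding="utf-8", newline="") as src:
        header = src.readline()
        for index, line in enumerate(src):
            if index % lines_per_chunk == 0:  # time to start a new chunk
                if out is not None:
                    out.close()
                chunk_path = os.path.join(out_dir, f"chunk_{len(chunk_paths) + 1:04d}.csv")
                out = open(chunk_path, "w", encoding="utf-8", newline="")
                out.write(header)
                chunk_paths.append(chunk_path)
            out.write(line)
    if out is not None:
        out.close()
    return chunk_paths

# Demo on a tiny made-up file: a header plus five data rows, two rows per chunk
work_dir = tempfile.mkdtemp()
master = os.path.join(work_dir, "master.csv")
with open(master, "w", encoding="utf-8", newline="") as f:
    f.write("time,yield\n")
    for i in range(5):
        f.write(f"{i},{i * 10}\n")

chunks = split_file(master, lines_per_chunk=2, out_dir=work_dir)
print(len(chunks))
```

For Excel, a value such as lines_per_chunk=1_000_000 keeps every chunk under the worksheet limit of 1,048,576 rows.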
  • asked a question related to Microsoft Excel Data Analysis
Question
2 answers
I am working on portfolio optimization using the lower partial moment of order 1. Can someone help me implement LPM-1 in an Excel sheet, with a threshold value "tau" of 0.00% and order (n) of 1?
Thank you all in advance for your contributions to my question.
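As a reference for the calculation, the lower partial moment of order n with threshold tau is the average of max(tau - r, 0)^n over all returns r. A small sketch with made-up returns follows; in Excel the same quantity could be built with a helper column containing MAX(tau - r, 0) and AVERAGE over that column.

```python
def lower_partial_moment(returns, tau=0.0, order=1):
    """Average of max(tau - r, 0) ** order over all observed returns."""
    return sum(max(tau - r, 0.0) ** order for r in returns) / len(returns)

# Hypothetical periodic returns of one asset (as fractions, not percent)
returns = [0.02, -0.01, 0.03, -0.04, 0.00]

lpm1 = lower_partial_moment(returns, tau=0.0, order=1)
print(lpm1)
```

Note that the average is taken over all observations, not only over those below the threshold; returns at or above tau contribute zero.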
Relevant answer
Answer
  • asked a question related to Microsoft Excel Data Analysis
Question
5 answers
To dear Researchers,
I was analyzing a concentration series to estimate real-time PCR efficiency. The dilution ratio was 1:10. I used MS Excel to evaluate the slope. The resulting slope was -8, which was consistent with the equation shown on the scatter plot. I also drew a line plot, in which the slope was -3.2 and the efficiency was 100%. Also, there was no linearity in the scatter plot, but the line plot showed a well-fitted linear relationship (0.99). I'm not sure what the problem is. Please help me figure it out.
Kind regards
Alireza
Relevant answer
Answer
Ok, had a look.
The problem is that the student is plotting Cq (a log2 metric) against concentration (a linear metric), which gives you a ridiculous skew that is impossible to fit a line to. You can fit a log-curve to it (really well, for obvious reasons), but this is not how it is generally done.
You either linearise your Cq values, or log transform your concentrations (the latter is much easier, and arguably more informative, here).
This (a log/log plot) generates linear data, with (as expected) the gradient of -3.28.
I've attached my edited version.
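The log-transform step described above can be sketched as follows: regress Cq on log10(concentration) and convert the slope to an efficiency with E = 10^(-1/slope) - 1, so that a slope near -3.32 corresponds to about 100%. The Cq values below are invented for a 1:10 series.

```python
from math import log10

def linear_fit(x, y):
    """Ordinary least-squares slope and intercept of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx

def pcr_efficiency(slope):
    """Amplification efficiency from the standard-curve slope (1.0 = 100%)."""
    return 10 ** (-1.0 / slope) - 1.0

# Hypothetical 1:10 dilution series and measured Cq values
concentrations = [1e5, 1e4, 1e3, 1e2, 1e1]
cq = [15.0, 18.3, 21.7, 25.0, 28.3]

log_conc = [log10(c) for c in concentrations]
slope, intercept = linear_fit(log_conc, cq)
efficiency = pcr_efficiency(slope)
print(round(slope, 2), round(efficiency * 100, 1))
```

Fitting against the raw concentrations instead of their logarithms is what produces a meaningless slope such as the -8 reported in the question.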
  • asked a question related to Microsoft Excel Data Analysis
Question
5 answers
I'm looking for an excel sheet to calculate the STS Score. From the website http://riskcalc.sts.org/stswebriskcalc/calculate. Does anyone have a worksheet to calculate the STS Score?
Relevant answer
Answer
I also tried to find an Excel version of the STS calculator, but I have not found one. There are more than 60 parameters, and it is convenient to use the calculator on the STS site.
  • asked a question related to Microsoft Excel Data Analysis
Question
4 answers
I submitted three sample materials for DMA analysis. The resulting data (frequency, temperature, E', E'', dL, tanD, Ft, time) came as an Excel file, together with the graphs of all materials from the analysis software, which were included in a separate sheet of the same file. On inspecting the graphs, I noticed that, unlike most DMA graphs, which present three smooth curves for the storage modulus, loss modulus and loss tangent, the graphs I received have multiple curves for each of these quantities, which I cannot properly interpret.
Furthermore, because I need to make a comparative study of my materials, I am required to plot the same quantity for all samples in one graph. When I first plotted the storage modulus, what appeared was three zigzag lines following a curve instead of three smooth curves. The same curving zigzag lines also appeared when I plotted the loss modulus and loss tangent of the sample materials.
Are the Excel data and graphs I received from the DMA analysis reliable? How do I properly plot the DMA data in Excel or, possibly, in other software to achieve the desired results? Thank you.
Attached here are the DMA graph of Sample 1, the graph for the storage modulus of Samples 1, 2, & 3, and DMA graph from Lozano-Sánchez, L., Bagudanch, I., Sustaita, A., Iturbe-Ek, J., Elizalde, L., Garcia-Romeu, M., & Elías-Zúñiga, A. (2018). Single-Point Incremental Forming of Two Biocompatible Polymers: An Insight into Their Thermal and Structural Properties. Polymers, 10(4), 391. For your reference.
Scatter with Smooth Lines were used to plot E' (y-axis) and Temperature (x-axis) using Microsoft Excel in the image Storage Modulus of Three Samples.JPG
Relevant answer
Answer
The first step is to use an option in Excel called "export file". Next, you choose one of the possible formats for the exported file. The most commonly used are either text (*.txt) with tab or spaces delimiters , or comma delimiters (*.csv).
The result is tables of data that you can use in any plot program available. As already noted by Oya Tagit, the most suitable program for scientific charts is "Origin". If you do not have it, on the website https://www.originlab.com you can start mastering it from the free version.
  • asked a question related to Microsoft Excel Data Analysis
Question
7 answers
I have 3 data sets (eg Scores from three different Universities) all mixed up in a single excel sheet and with different number of scores. Eg Oxford sample with 4000 scores, Cambridge with 3200 scores, and Portsmouth with 1800 scores and I want to extract the samples,what would the macros look like? Please note the spreadsheet has 2 columns i.e. University and Score. If someone can also answer the same question for SPSS I will also be grateful.
Relevant answer
Answer
Following on from David Eugene Booth's comment, many statistical procedures allow you to weight the subsamples to take into account different sample sizes and different university populations.
  • asked a question related to Microsoft Excel Data Analysis
Question
20 answers
I am required to do a statistical analysis comparing Country A and Country B in terms of their approaches to minimize or prevent occurrence of COVID-19 in their respective country.
So far these are the information I have gathered:
- Positive cases from COVID-19
- Death cases from COVID-19
- Mitigation measures taken (Descriptive)
- Condition of country: health, economy, social
For now I can only decide to do T-test for death cases. Do you think it is suitable?
Feedback and ideas are very much appreciated. Thank you in advance!
Relevant answer
Answer
Any meaningful statistical analysis aimed at comparing the efficiency of two countries in handling COVID pandemic MUST be based on dynamic data (daily or weekly) over a long period rather than on one-time data collected for one particular day, such as 14th August 2020. No meaningful analysis can be performed using only one-time data.
  • asked a question related to Microsoft Excel Data Analysis
Question
2 answers
Dear colleagues,
Do you have an Excel template for calculating the Gini coefficient and creating a Lorenz curve? I found some R packages, but I am not sure how to use them on my data, which has population and income by region.
Relevant answer
Answer
you can see this document to understand the gini formulas
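For grouped data of the kind described in the question, one common calculation can be sketched as follows, assuming each region contributes a population and a total income (made-up numbers below): sort regions by income per head, accumulate population and income shares to get the Lorenz points, and take the Gini coefficient as one minus twice the area under that curve (trapezoid rule).

```python
def lorenz_points(populations, incomes):
    """Cumulative (population share, income share) points, poorest regions first.

    incomes are regional totals, not per-capita values.
    """
    regions = sorted(zip(populations, incomes), key=lambda r: r[1] / r[0])
    total_pop, total_inc = sum(populations), sum(incomes)
    points = [(0.0, 0.0)]
    cum_pop = cum_inc = 0.0
    for pop, inc in regions:
        cum_pop += pop
        cum_inc += inc
        points.append((cum_pop / total_pop, cum_inc / total_inc))
    return points

def gini(points):
    """Gini coefficient from Lorenz points using the trapezoid rule."""
    area = sum(
        (x1 - x0) * (y1 + y0) / 2.0
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )
    return 1.0 - 2.0 * area

# Hypothetical regions: population and total income
populations = [100, 100]
incomes = [1000, 3000]

points = lorenz_points(populations, incomes)
print(points, gini(points))
```

Plotting the points as a scatter chart with straight connecting lines gives the Lorenz curve. Because each region is treated as internally equal, this measures inequality between regions only and will understate inequality between individuals.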
  • asked a question related to Microsoft Excel Data Analysis
Question
1 answer
Hi everyone, as stated above, I would like to know whether it is possible to export data of SpO2 and Nasal Flow in .csv or .xls(x) format from ApneaLink for further analysis.
Thank you in advance, best regards.
Relevant answer
I'm interested in this question too!
  • asked a question related to Microsoft Excel Data Analysis
Question
4 answers
Is it possible to use a central composite design and the response surface method when there are two independent variables? Say the first independent variable represents 5 graded levels of a fatty acid and the second represents 2 temperatures, and I would like to see their interaction effect on fish.
Please provide me suggestion how it can be analysed in Excel or SPSS.
Relevant answer
Answer
Thank you Professor Vallejo. If I will reduce the levels in the first variable it will not show any significant response. But I will increase the levels in 2nd variable.
  • asked a question related to Microsoft Excel Data Analysis
Question
3 answers
Hi all,
I'm looking at accelerometer data. I used two accelerometers at once as a backup, but now I need to correlate the two datasets they provided. I'm only looking at one axis on each, but one set of data has ~5000 more data points than the other. This may sound like a lot, but the datasets hold about 215,000 data points on average. During data collection, I paired the two so that they began and ended collecting data at the same time, but one still recorded more movements than the other. Is there any way around this, so that I can compare the two and say that both pick up about the same data? A correlation test in R isn't working because the two columns contain different amounts of data. I do, however, have the mean and SD of each set.
Thanks so much in advance.
Relevant answer
Answer
Lacey Valois, if your time series are simply recorded at two different frequencies and are thus not synchronised, you can, for example, use the Hayashi-Yoshida approach to compute a covariance without removing any observations:
Hayashi, T., & Yoshida, N. (2005). On covariance estimation of non-synchronously observed diffusion processes. Bernoulli, 11(2), 359-379.
In addition, if you have doubts about the accuracy of the recording times, you can increase the time step when computing the correlation (for example, by using only every second observation). By doing so, you should see an increase in the correlation (known as the Epps effect).
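As a rough illustration of the Hayashi-Yoshida idea, the sketch below sums the products of increments over every pair of observation intervals that overlap in time. It assumes each series comes with its own ascending list of timestamps; the function and argument names are illustrative only, and this is a bare-bones sketch rather than a tuned implementation.

```python
def hayashi_yoshida_cov(tx, x, ty, y):
    """Hayashi-Yoshida covariance of two asynchronously sampled series.

    tx, ty: ascending observation times; x, y: values observed at those times.
    Adds dx_i * dy_j for every pair of intervals (tx[i-1], tx[i]] and
    (ty[j-1], ty[j]] that overlap.
    """
    cov = 0.0
    j_start = 1
    for i in range(1, len(tx)):
        a0, a1 = tx[i - 1], tx[i]
        dx = x[i] - x[i - 1]
        # Skip y-intervals that end at or before the start of this x-interval.
        while j_start < len(ty) and ty[j_start] <= a0:
            j_start += 1
        j = j_start
        # Visit every y-interval that starts before this x-interval ends.
        while j < len(ty) and ty[j - 1] < a1:
            cov += dx * (y[j] - y[j - 1])
            j += 1
    return cov
```

When both series share the same timestamps, this reduces to the ordinary sum of products of increments, which is a convenient sanity check.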
  • asked a question related to Microsoft Excel Data Analysis
Question
3 answers
I am doing research for my thesis which is comparing the genetic variants (genes, SNPS, etc.) of 4 medical conditions. I have identified approximately 7,000-10,000 genetic variants that I would like to analyze, compare, and run a statistical analysis on. I would like to complete my thesis in this lifetime, so manual entry is probably not the best option nor is it the most accurate (due to human error). If anyone has any techniques that would work, it would be wonderful. Though I am comfortable with EXCEL, I have access to SPSS statistical software, maybe export into SPSS is easier/possible? Thank you.
Relevant answer
Answer
Thank you.
  • asked a question related to Microsoft Excel Data Analysis
Question
3 answers
Hi,
I want to calculate the correlation of the Bitcoin price with some other asset classes such as gold or oil. To do so, I take the daily returns in percent of each asset and simply apply the correlation formula in Excel. To be sure, I got two different data sources each for oil and gold (Krugerrand and H&H gold, Texas and London oil). Obviously, the historical prices of Krugerrand and H&H are almost the same, and the same applies to the two kinds of oil.
My question/problem is: the correlation of the absolute values of Gold 1 and Gold 2, and of Oil 1 and Oil 2, is close to one, which makes sense. The correlation of the returns, however, is close to zero, i.e. non-existent. How can that be? Shouldn't the returns be very positively correlated as well?
Furthermore, the correlation of each time series with Bitcoin differs, even though I would expect Oil 1 and Oil 2 (and likewise Gold 1 and Gold 2) to have the same relation to Bitcoin.
I have been trying to figure this out for days now and I am running out of time for this project. I would really appreciate it if anyone has an idea of what I am not seeing here.
I attached the Excel file; the most relevant sheets are highlighted in red.
Relevant answer
Answer
The correlation is calculated from the simultaneous data pairs (xi, yi). To calculate the returns, you take the change between two consecutive time steps. That is, in principle, a different quantity, and here we see a nice example of how this can lead to interesting results.
The correlation of the day-by-day returns is zero. Conclusion: there is no same-day correlation between the returns of these two series.
But there might be a correlation between day i and day i+n. So try to correlate (xi, yi+1), (xi, yi+2), … and you will observe an interesting result 😉. Hope that helped, and good luck.
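The lagged comparison suggested here can be sketched in a few lines of Python. This is only an illustration under simple assumptions: both price lists cover the same dates in the same order, and the lag is a non-negative number of days. If the correlation jumps towards one at a lag of one day, the two sources are probably dated or time-stamped differently (for example, different closing times). All function names are made up for this sketch.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def simple_returns(prices):
    """Day-over-day relative change of a price series."""
    return [p1 / p0 - 1 for p0, p1 in zip(prices, prices[1:])]


def lagged_correlation(x, y, lag):
    """Correlate x[i] with y[i + lag] for a lag of 0, 1, 2, ... days."""
    if lag:
        return pearson(x[:-lag], y[lag:])
    return pearson(x, y)
```

A typical use would be to compute `simple_returns` for both gold series and then evaluate `lagged_correlation` for lags 0, 1 and 2, swapping the arguments to test the opposite direction.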
  • asked a question related to Microsoft Excel Data Analysis
Question
4 answers
Is there an Excel implementation of the Douglas Peucker algorithm?
Relevant answer
Answer
Some improvements can be made to Ian's solution especially if you are working with students learning programming and/or VBA.
  • Define classes for points and polylines
  • Instantiate variables of those classes to avoid using variants (may lead to memory improvements)
That might be overkill for business analysts or people who just need something done in Excel, but an important step (I think!) for people wanting to learn better coding skills that may be applicable to other languages.
Suggestions welcome!
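For readers who want to see the algorithm itself, here is a short Python sketch of Douglas-Peucker line simplification that uses a small point class, in the spirit of the suggestion above. It is not Ian's VBA solution (which is not reproduced in this thread); the names `Point`, `perpendicular_distance` and `douglas_peucker` are illustrative.

```python
from math import hypot
from typing import NamedTuple


class Point(NamedTuple):
    x: float
    y: float


def perpendicular_distance(p, a, b):
    """Distance from point p to the straight line through a and b."""
    dx, dy = b.x - a.x, b.y - a.y
    length = hypot(dx, dy)
    if length == 0:  # a and b coincide
        return hypot(p.x - a.x, p.y - a.y)
    return abs(dy * (p.x - a.x) - dx * (p.y - a.y)) / length


def douglas_peucker(points, epsilon):
    """Simplify a polyline, keeping points farther than epsilon from the chord."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = perpendicular_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax > epsilon:
        # Keep the farthest point and simplify both halves recursively.
        left = douglas_peucker(points[:index + 1], epsilon)
        right = douglas_peucker(points[index:], epsilon)
        return left[:-1] + right
    return [first, last]
```

The recursion mirrors the textbook description; for very long polylines an explicit stack would avoid hitting the recursion limit.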
  • asked a question related to Microsoft Excel Data Analysis
Question
16 answers
To analyze sentiment in data downloaded from social media portals (such as Facebook, Twitter, and LinkedIn, but also YouTube, Instagram, ...) and aggregated in Big Data database systems, specialized software is needed to extract and analyze these data.
The quality of the data transferred to, for example, Excel sheets depends on the quality of the extraction process carried out with the help of specialized software.
In turn, the result obtained, that is, the answer to the question posed to the collected, initially unstructured data in the Big Data database system, depends on the quality of the analysis software used in Excel sheets or on computerized analytical platforms.
In the future, artificial intelligence may be used for this purpose, so that the whole process of analyzing the collected data becomes more effective and automated, less error-prone, cheaper, and much faster, even on much larger collections of information than today's.
In view of the above, my question is: in which directions will the analytical processes used for sentiment analysis of data collected in Big Data database systems develop in the future?
Please answer and comment. I invite you to the discussion.
Relevant answer
  • asked a question related to Microsoft Excel Data Analysis
Question
4 answers
I have a data set comparing the accumulated biomass on two types of substrate. The number of samples differs between the substrates and is low for both (n<10).
What would be the most appropriate test to show significant differences between the two substrates?
I've tried PERMANOVA v7 on a Euclidean distance resemblance matrix, but it seems a bit too much for such a small sample size.
Suggestions anyone?
Relevant answer
Answer
It really depends on the nature of the data and on the hypothesis you want to test. For example, a t-test may be appropriate, or Wilcoxon-Mann-Whitney, or Mood's median test, and so on.
  • asked a question related to Microsoft Excel Data Analysis
Question
3 answers
How can I derive the coefficients of a polynomial equation in Excel, via interpolation of hourly measured values of net power against measured solar data?
Relevant answer
Answer
Hello Haseeb Shah
Consider that the data [x1 x2 x3 x4 x5 x6 x7] are in cells A2:A8
and the data [y1 y2 y3 y4 y5 y6 y7] are in cells B2:B8,
and we wish to fit the quartic polynomial y = a*x^4 + b*x^3 + c*x^2 + d*x + e
to the data.
The formula to obtain the coefficients a, b, c, d, e is
=LINEST(B2:B8,A2:A8^{1,2,3,4},TRUE,FALSE)
LINEST returns the coefficients as a row in the order a, b, c, d, e; in older Excel versions, enter it as an array formula (Ctrl+Shift+Enter) across five cells. To determine the coefficients of a quintic polynomial, write ^{1,2,3,4,5} instead.
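To check the spreadsheet result outside Excel, the same least-squares polynomial fit can be sketched in plain Python by solving the normal equations. This is a teaching sketch under the assumption of a low polynomial degree and well-scaled x values; the normal equations become ill-conditioned for high degrees. `polyfit` here is a hand-written helper, not a library call.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients, highest power first."""
    n = degree + 1
    # Normal equations A c = b, with A[r][c] = sum(x^(r+c)), b[r] = sum(y * x^r).
    a = [[sum(x ** (r + c) for x in xs) for c in range(n)] for r in range(n)]
    b = [sum(y * x ** r for x, y in zip(xs, ys)) for r in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= f * a[col][c]
            b[r] -= f * b[col]
    # Back substitution; coeffs[k] multiplies x^k.
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(a[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = (b[r] - s) / a[r][r]
    return coeffs[::-1]
```

The return order (highest power first, constant term last) matches the a, b, c, d, e order of the polynomial written above.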
  • asked a question related to Microsoft Excel Data Analysis
Question
3 answers
Hi,
Similar to Matlab's xlswrite(filename,A), what is the function in Scilab for writing to a new Excel file? I tried searching, but couldn't find one. Only write_csv was available.
Thanks,
Praveen
Relevant answer
Answer
Hello Praveen,
Scilab can import files from Excel with xls_open and xls_read. With write_csv you can write data to a CSV file, which Excel can open.
With best regards,
Thomas
  • asked a question related to Microsoft Excel Data Analysis
Question
4 answers
I need to transfer four adjacent cells from Table 1 to Table 2 based on a single cell value with only some of the words in the cell matching. Can anyone please help with a formula/function/script? Many thanks!
Relevant answer
Answer
Index-Match can get multiple columns.
You can add a third sheet, and paste this formula in all the cells (A1:D20) assuming you have the names in Table 1 range A1:A20, and the cells to be matched at Table 2 range A1:D20.
The formula is:
=INDEX('Table 2'!$A$1:$E$20,MATCH('Table 1'!$A1,'Table 2'!$A$1:$A$20,-1),COLUMN())
I attached an excel file with the formula.
  • asked a question related to Microsoft Excel Data Analysis
Question
3 answers
Hello,
I'm trying to graph some data and can't figure out how to make the graph I need in Excel. I have a single DMSO control and then a dose-response series for several compounds. How can I graph this data without getting a huge gap between DMSO and Compound A? Also, how do I remove DMSO from the legend? The Excel file is attached. Tips for Excel or Prism 6 would be helpful. Thanks!
Relevant answer
Answer
Thank you so much, Alexandra!
  • asked a question related to Microsoft Excel Data Analysis
Question
6 answers
How can I directly import a specific column or row from an Excel file saved at some location into sciNotes?
How can I interface scinotes and Excel?
Relevant answer
Answer
You can do this easily by adding a table to a SciNote project (say, to the results), then copying the data from Excel and pasting it in. If you want to copy larger data sets or a full table, you can import them with the "Add file" button, or you can go to the "Samples" menu, click the "Import" button, and point to your Excel file. (The Excel file should have a header row with the data names, followed by the data.)
  • asked a question related to Microsoft Excel Data Analysis
Question
7 answers
Hey all, 
I do calcium imaging on dissociated neurons and am currently experiencing an issue with my data analysis. 
Essentially, the computer I use to image these neurons takes a data point every 2 seconds. This computer has limited processing power and RAM, so when I am imaging a large number of neurons at a time, it is unable to image them at exactly 2-second intervals. My x values will look like x=2.12, 4.14, 6.19, 8.41 etc. instead of x=2.00, 4.00, 6.00, 8.00 etc.
The problem is that I need to overlay the data sets to create a mean data set of all the neurons I've imaged. However, I am unable to create a mean data set in GraphPad Prism because the x values of the individual neurons do not 'match up'.
I'm looking for a solution in Excel to make the x values of the recorded neurons 'match up'. I can see two possible solutions: (A) binning all the x values in a certain range to a single number, i.e. changing all x values that fall between 3.00 and 5.00 to 4.00. The other solution (B) would be to predict the y value at a given x value from the y values at other x values, i.e. if I have x=2.12, y=5.00 and x=4.14, y=7.00, using these known values to predict what y would be at x=2.00.
I've attached a couple of screenshots of the excel files I work with to hopefully clarify what I am trying to do. Each sheet corresponds to a different plate of neurons that was imaged. The values in the A column are time (the x values) and every other column corresponds to an individual neuron on that plate and their calcium level (y value). 
Thanks for your help. 
Relevant answer
Answer
Joseph, the resolution of screenshots of the spreadsheets is (more than) correct (3600x1080) but RG doesn't scale those images correctly. Download them to your computer and open in any image viewer.
I'm not convinced if HISTOGRAM will do what is required by Sebastian.
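Option (B) from the question, predicting y on a regular time grid from the neighbouring measured points, amounts to linear interpolation. A minimal Python sketch is shown below; it assumes the measured times are in ascending order and, for grid points outside the measured range (such as x=2.00 before a first sample at 2.12), extends the nearest segment as a straight line. The function names are illustrative.

```python
from bisect import bisect_left


def interpolate(xs, ys, x):
    """Linearly interpolate y at x; xs must be ascending.

    Outside the measured range, the nearest segment is extended linearly.
    """
    i = bisect_left(xs, x)
    i = min(max(i, 1), len(xs) - 1)  # clamp so that (i - 1, i) is a valid segment
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def resample(xs, ys, grid):
    """Values of the piecewise-linear curve through (xs, ys) at each grid time."""
    return [interpolate(xs, ys, g) for g in grid]
```

After every neuron has been resampled onto the same grid (2, 4, 6, ... seconds), the columns line up and a mean trace can be computed row by row. In Excel, the FORECAST function applied to the two neighbouring points performs the same two-point calculation.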
  • asked a question related to Microsoft Excel Data Analysis
Question
8 answers
I have a large amount of daily precipitation data in a single column (covering 53 years). I need to convert it to monthly sums. How can I do that? The available data format and the required format are shown in the attached sheet. Kindly guide me.
Relevant answer
Answer
Dear Muhammad,
I use R statistical program. Although I don't know if you use R, I am posting a script I have developed which gives sums of monthly rainfall from your data. It may be useful for others also who may need a quick way of doing this. It is appended/pasted below..
setwd("D:\\IMTR\\AMTC16_STUDENT PROJECTS\\Victor_Bible") ## set this to the directory on your computer where you saved your data file (Data_solved)
yourdata <- read.csv("Data_solved.csv", header = TRUE) ### make sure you save the Data_solved file in comma-separated (.csv) format
head(yourdata) # view your data
## we are interested in columns 2, 3 and 4, i.e. Year, Month and Value
# tapply, list and sum give the solution, as shown below
monthlysum <- tapply(yourdata$Value, list(year = yourdata$Year, month = yourdata$Month), sum) # sums the daily values into monthly totals for each year
monthlysum <- as.data.frame(monthlysum) # convert to a table
head(monthlysum) # view the output
write.csv(monthlysum, "Monthlyraintotal.csv") # save the final output in the working directory as "Monthlyraintotal.csv"
  • asked a question related to Microsoft Excel Data Analysis
Question
4 answers
The data are on a Likert scale of 1-4, with 1 = highly unimportant and 4 = highly important.
6 parameters
Relevant answer
Answer
Hi,
I am attaching another article with the formulations that I used in the Excel file.
You can see there how CI and CR are calculated.
They can be 0 when consistency is perfect (like in the initial matrices when all the choices are 1). Usually in real life choices it will not happen.
BR
Elli
  • asked a question related to Microsoft Excel Data Analysis
Question
10 answers
How can I illustrate a forest plot in excel for 2 groups? I would like to do meta analysis for each publication, each with a different technique.
Relevant answer
Answer
Anyone with a similar question might also want to consider Meta-Essentials, a set of free and simple workbooks for Microsoft Excel. Among many other features, it also produces forest plots with subgroups. Have a look at www.meta-essentials.com if you like.
Good luck!
Henk
  • asked a question related to Microsoft Excel Data Analysis
Question
4 answers
Hi,
I am writing a thesis about predicting abnormal stock returns based on sentiment analysis of tweets.
More specifically, we have a huge dataset of tweets, corresponding to a randomized sample of about 1% of all tweets during a year.
Now, we want to pick out the tweets mentioning the companies in the index we are looking at, which is the EURO STOXX 50.
We want to filter our dataset for tweets containing a cashtag ($) for any of our companies. For example, AstraZeneca would be $AZN, after its ticker symbol. So for this index we will filter for a list of 50 cashtags. How can we do this? Preferably in Excel.
I enclosed a picture of what our spreadsheet looks like, as well as a sample of the dataset.
KR
Benjamin
Relevant answer
Answer
I would say this is difficult to do in Excel and quite convoluted. You can do this simply in Python by exporting your data as a CSV file. You can read your data with pandas, then just use regular expressions to look for these names. I can help you with this. Feel free to contact me by leaving a message on my site (arindampaul.me).
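A minimal sketch of the regular-expression step, using only Python's standard `re` module, might look like this. The ticker set, the function names and the pattern are assumptions for illustration: the pattern matches a `$` followed by one to six letters and ignores amounts such as `$5`; tickers containing dots or digits would need a wider pattern.

```python
import re

# Hypothetical subset of tickers; replace with the 50 cashtags of the index.
TICKERS = {"AZN", "SAP", "SAN"}

# "$" not preceded by a letter, digit or underscore, then 1-6 letters.
CASHTAG_RE = re.compile(r"(?<![A-Za-z0-9_])\$([A-Za-z]{1,6})\b")


def matched_cashtags(text, tickers=TICKERS):
    """Return the set of tickers from `tickers` that appear as cashtags in text."""
    found = {m.upper() for m in CASHTAG_RE.findall(text)}
    return found & set(tickers)


def filter_tweets(tweets, tickers=TICKERS):
    """Keep only the tweets that mention at least one of the tickers."""
    return [t for t in tweets if matched_cashtags(t, tickers)]
```

With pandas, the same function could be applied to the tweet-text column to build a boolean filter before exporting the matching rows back to Excel.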
  • asked a question related to Microsoft Excel Data Analysis
Question
2 answers
A simple example for 4 P facilities and 20 demand points would straighten things out for me.
Relevant answer
Answer
Do you mean a problem like http://www.ualberta.ca/~cneuman/mgtsc461/Lectures/Lecture%204%20Sep%2030/lecture4a.ppt (slides 4, 5 and 7)? Just define it as a binary linear problem in the way compatible with Solver.
  • asked a question related to Microsoft Excel Data Analysis
Question
6 answers
Attached is an example of my problem. The different names listed in Column A need to be the headings of columns in a new table, with ALL the corresponding values listed under the appropriate headings in the table. Can anyone help? I can't figure it out although it seems like such a simple problem.
Relevant answer
Answer
@Gerhard
Q: Do you want all the "January" values in just one column (that is, 12 columns, one per month)?
Ans: Kleomenis's suggestion works well.
Q: Or several columns for January (depending on the length of your data)?
Ans: A pivot table may be a solution.
  • asked a question related to Microsoft Excel Data Analysis
Question
11 answers
My dataset has 32 rows and 13 columns containing monthly rainfall data of 31 years. (The sample file is provided herewith).
Relevant answer
Answer
Hello, I'm not sure what you mean by a "dataset containing separate no. of column and row". R doesn't worry about this. This new problem looks the same as the first one posted.
Read the data from a file into a table:
> data<-read.table(file="c_pr.csv",sep=",",row.names=1,header=TRUE)
The row labels from column 1 of the .csv file become row names in the R object "data" so its elements are purely numerical values. That is, there are no columns of metadata, only columns of data.
The analysis with the zyp package follows by
> test.output<-zyp.trend.dataframe(data, metadata.cols=0, method="zhang",conf.intervals=TRUE, preserve.range.for.sig.test=TRUE)
and is viewed by
> test.output
  • asked a question related to Microsoft Excel Data Analysis
Question
11 answers
I have two datasets which are outputs from a model. Each dataset contains an array of independent variable X and 10 dependent variables Y. I have tried to create 4 different visualization options in Excel, MATLAB, Mathematica & ENVI IDL (Check attached image). However, I would like to know if there is a better graphical/data visualization alternative. The first plot in the image was created in Excel with the simple 'scatter plot with smooth lines' option. The rest 3 options were plotted in MATLAB, Mathematica and IDL programmatically. The Excel and IDL plots look similar. MATLAB and Mathematica plots share the same characteristic look. I am trying to find a way to present the differences clearly between the two Scenarios A & B graphically. I would appreciate if someone can suggest a better statistical/graphical/visualization/plot option.
Following are my datasets:
SCENARIO A
x= [0, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1]
y1= [0, 0, 0.02, 0.01, 0, 0, 0, 0, 0, 0, 0, 0]
y2= [0.01, 0, 0.05, 0.1, 0.19, 0.6, 0.87, 1, 1, 1, 1, 1]
y3= [0.02, 0.05, 0.2, 0.69, 0.99, 1, 1, 1, 1, 1, 1, 1]
y4= [0.02, 0.12, 0.25, 0.97, 1, 1, 1, 1, 1, 1, 1, 1]
y5= [0, 0.12, 0.68, 1, 1, 1, 1, 1, 1, 1, 1, 1]
y6= [0, 0.2, 0.84, 1, 1, 1, 1, 1, 1, 1, 1, 1]
y7= [0.01, 0.49, 0.97, 1, 1, 1, 1, 1, 1, 1, 1, 1]
y8= [0.01, 0.51, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
y9= [0.01, 0.82, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
y10= [0, 0.84, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
SCENARIO B
x= [0, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1]
y1= [0.01, 0.03, 0.01, 0, 0.01, 0, 0, 0, 0, 0, 0, 0]
y2= [0.01, 0.07, 0.04, 0.13, 0.23, 0.5, 0.92, 1, 1, 1, 1, 1]
y3= [0.01, 0.03, 0.2, 0.61, 0.99, 1, 1, 1, 1, 1, 1, 1]
y4= [0.02, 0.06, 0.4, 0.99, 1, 1, 1, 1, 1, 1, 1, 1]
y5= [0, 0.24, 0.61, 1, 1, 1, 1, 1, 1, 1, 1, 1]
y6= [0, 0.26, 0.88, 1, 1, 1, 1, 1, 1, 1, 1, 1]
y7= [0, 0.51, 0.99, 1, 1, 1, 1, 1, 1, 1, 1, 1]
y8= [0.02, 0.64, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
y9= [0.02, 0.87, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
y10= [0.01, 0.94, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
Relevant answer
Answer
R (https://cran.r-project.org/) has a lot of options for plot visualization. There are R packages for working with Matlab and other tools.
  • asked a question related to Microsoft Excel Data Analysis
Question
54 answers
I use Microsoft Excel for numerous spreadsheets for clinical trials. Anything from enrollment logs, reimbursement to the subjects, reimbursement from the sponsors, lists of potential subjects, etc. I'd hate to think how much manual data I would have to capture without it.
Relevant answer
Answer
Dr. Mohammad,
You've provided a comprehensive list. Let me share another interesting one.
  • asked a question related to Microsoft Excel Data Analysis
Question
11 answers
Dear friends and colleagues!
Can anyone who knows how please help me write Microsoft Excel formulas for calculating a binary logistic regression?
The calculation needs to be a set of formulas in Excel cells or, as a last resort, a VBA script.
Please do not suggest R, SPSS, MathCad, or other tools!
With this method I want to improve a model for predicting the activity of natural focal infections. The model is built in Excel, and there is no reason to rewrite it for another platform.
If someone gives me a suitable solution, they will be a co-author on all publications made using this method.
Relevant answer
Answer
Dear, You may find useful information in the attached link
  • asked a question related to Microsoft Excel Data Analysis
Question
7 answers
I am doing a thesis on M&As, and I have to check for significant differences in the operating performance of cross-border and domestic M&As and compare the two. Ideally, I would compute the median pre- and post-acquisition performance over the 2-3 years before and after the takeover. However, I would then need to compute the median difference between the pre- and post-acquisition performance and determine its statistical significance.
I am confused on how to go about it using Excel's T-Test. When I select the variables, I am made to select the output range. I do not understand how the author of the paper I am referring to has done it (I have attached a screenshot of the author's method).
Your inputs are appreciated.
Relevant answer
Answer
   
@NaIr: You can use the Mann-Whitney test. It's available in SPSS.
  • asked a question related to Microsoft Excel Data Analysis
Question
8 answers
I have some data that I need to find 50% value for. The chart consists of the period (days) as X axis values and percentages as Y axis values. I was trying to find a way to determine the 50% value on a trend line for each treatment. Is there any easy way to do that without going through modifying the formula and applying it for each treatment?
I attached a pic for one of the treatments I want to find 50% value for.
Thank you in advance for your assistance.
Relevant answer
Answer
 Dear Mustafa,
Here is an approach in MS Excel.
For the BLUE line:
INTERCEPT = INTERCEPT(known_y's,known_x's)
SLOPE = SLOPE(known_y's,known_x's)
To estimate the X value at which Y = 50, use:
=(50-INTERCEPT(known_y's,known_x's))/SLOPE(known_y's,known_x's)
For the RED line, proceed in the same way.
I hope this is what you want (an example is attached).
Raoof
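The same calculation can be sketched in Python to check the spreadsheet result. It assumes, as the formulas above do, that a straight line is an adequate trend for the treatment; `linear_fit` and `x_at` are illustrative helper names.

```python
def linear_fit(xs, ys):
    """Least-squares slope and intercept, like Excel's SLOPE and INTERCEPT."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx


def x_at(y_target, xs, ys):
    """X value at which the fitted straight line reaches y_target."""
    slope, intercept = linear_fit(xs, ys)
    return (y_target - intercept) / slope
```

Calling `x_at(50, days, percentages)` once per treatment gives the day on which each trend line crosses 50%.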
  • asked a question related to Microsoft Excel Data Analysis
Question
7 answers
I want to quantify the change in protein expression from cell lysates by western blotting. But first, I need to find the linear range of the signal. In order to do that, I loaded increasing amounts of protein, blotted them with appropriate antibodies, measured the signal, and plotted the results in Excel as a scatter plot.
As expected, more protein gives stronger signal. With low protein loading (0-10 micrograms), there is a proportional response, but with higher protein loads (> 20 micrograms), the signal flattens out. There is a plateau because signal detection is saturated.
I want to fit the data to show the plateau more clearly. How would I do this? I also want to find the linear range of this curve. Do I need to fit a second, straight line?
Relevant answer
Answer
The following are a couple of excellent articles on how to do non-linear fitting in Excel. These will allow/teach you to use any formula you want for the fit, so you are not limited by built-in formulas.
Kemmer, G. and S. Keller (2010). "Nonlinear least-squares data fitting in Excel spreadsheets." Nature Protocols 5(2): 267-281.
http://dx.doi.org/10.1038/nprot.2009.182
Brown, A. M. (2001). "A step-by-step guide to non-linear regression analysis of experimental data using a Microsoft Excel spreadsheet." Computer Methods and Programs in Biomedicine 65(3): 191-200.
http://dx.doi.org/10.1016/S0169-2607(00)00124-3
Note that if you find the appropriate formula to fit your calibration curve, you are not limited to the linear part of the calibration curve. Of course the error will be higher the closer you get to the plateau.
  • asked a question related to Microsoft Excel Data Analysis
Question
2 answers
I am having some difficulty organizing the exported Excel table for further analysis in the T Align software. Can anybody guide me on how to do it? I have attached a sample table; which columns should I use for the analysis in T Align?
Relevant answer
Answer
Mohammad, try the attached file. Basically I removed all unnecessary columns then re-ordered the TRF's from lowest to highest. You only need three columns (sample ID, TRF, area). You should probably remove rows based on the size standard that you used...typically you will remove data from trf's below 100 and above 500. Keep in mind that your first TRF sample point has to be in the B3 cell (column B row 3). You can add data from another sample in the same file, just skip a row and continue with the same format. Hope this helps.
  • asked a question related to Microsoft Excel Data Analysis
Question
4 answers
I only have field measured values of solar radiation, air temp, relative humidity, dew point, air pressure, wind speed and direction.  I do not have any information on ratio of clear/cloudy time and I'm having a difficult time arriving at the necessary emissivity values needed to calculate incoming longwave rad. 
Thank you for any help. 
Relevant answer
Answer
Dear Jessica,
I think the three answers you have received so far give you a clue to your question. If you have data on the surface net radiation, there may be another simple solution. Surface net radiation is the sum of net shortwave radiation (SWi-SWo) and net longwave radiation (LWi-LWo): Rnet=(SWi-SWo)+(LWi-LWo). If you have the other radiation components, LWi can be determined from this relationship: LWi=Rnet-(SWi-SWo)+LWo.
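As a small worked example of rearranging the radiation balance, the sketch below assumes the common sign convention Rnet = (SWi - SWo) + (LWi - LWo), with all terms in W/m² and "i"/"o" meaning incoming/outgoing. The function name and the numbers in the usage note are illustrative, not measured data.

```python
def incoming_longwave(r_net, sw_in, sw_out, lw_out):
    """Solve Rnet = (SWi - SWo) + (LWi - LWo) for the incoming longwave LWi."""
    return r_net - (sw_in - sw_out) + lw_out
```

For instance, with Rnet = 450, SWi = 800, SWo = 160 and LWo = 450, the function returns the LWi that closes the balance under the stated convention.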
  • asked a question related to Microsoft Excel Data Analysis
Question
14 answers
I am doing RT-PCR for my experiment. I have the Ct values and now want to calculate the fold difference between treatment and non-treatment.
If somebody has an Excel template, could you send it to me?
Thanks.
Relevant answer
Answer
I have one such template. I hope it helps.
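The attached template is not reproduced here, but fold differences from Ct values are commonly calculated with the 2^(-ΔΔCt) method. The Python sketch below illustrates that method under its usual assumptions: a reference (housekeeping) gene was measured alongside the target gene, and both amplify with roughly 100% efficiency. The argument names are illustrative.

```python
def fold_change(ct_target_treated, ct_ref_treated,
                ct_target_control, ct_ref_control):
    """Relative expression (treated vs. control) by the 2^(-ddCt) method."""
    d_ct_treated = ct_target_treated - ct_ref_treated   # normalise to reference gene
    d_ct_control = ct_target_control - ct_ref_control
    dd_ct = d_ct_treated - d_ct_control                 # treated relative to control
    return 2 ** (-dd_ct)
```

In a spreadsheet the same steps take three columns (ΔCt treated, ΔCt control, ΔΔCt) followed by =2^(-ΔΔCt).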
  • asked a question related to Microsoft Excel Data Analysis
Question
6 answers
Many formulae have been written for testing reliability,
but I have not seen any formulae for testing validity.
Establishing validity is essential for any research.
Relevant answer
Answer
Rayala, maybe then you need to formulate your question more precisely - or provide more context - so those who want to help you can give you a more specific response.
  • asked a question related to Microsoft Excel Data Analysis
Question
76 answers
Exporting the list of papers to Excel allows you to sort papers and delete duplicates.
Relevant answer
Answer
I was looking for a method to do something similar. I found two workable solutions.
The best was to use the Publish or Perish software (http://www.harzing.com/pop.htm). It cycles through the pages of a Google Scholar search results list and copies the basic information for each result to a results list that can be copied in CSV or Excel format.
The other method was to use Zotero (www.zotero.org) as an add-on to Firefox. You can use it to save the results on a single page of a Google Scholar results list (maximum 20 items). The drawback of this is that there doesn't seem to be a way to cycle automatically through the entire results list.
  • asked a question related to Microsoft Excel Data Analysis
Question
27 answers
I am currently plotting some graphs in Excel. I would like to compare the expression of a certain gene under different conditions. I made my calculations in Excel and ended up with a normal column graph of the normalized mean + SEM of the gene expression under the different conditions. In the picture I have attached, they show clearly whether the difference between any two column bars is statistically significant, with a nice line connecting the two bars and an annotation of whether P is smaller than 0.05. How can I annotate my graph in a similar way (so that I can clearly see which bars are compared with a t-test)?
Thank you in advance
Relevant answer
Answer
There is actually nothing really impossible in Excel, but many things are really hard to do with Excel in an automated way (adding fake data series to show the connecting lines, and more fake data series to show the p-values; calculating the correct positions is cumbersome and the correct formatting is hell). I would add the extra annotations simply as graphical objects (lines and text), either in Excel itself or with graphics software in the exported graphic file. (Many people here do it in PowerPoint, but that's not my favorite because it gets messy when you have to prepare a figure for publication in a given file format (say, TIFF), size, and resolution; this is better done in appropriate software such as Photoshop.)
Regarding the "normalized gene expressions", and without knowing much about your specific experiment and data: I hope you are aware that such measures ([relative/normalized] expression) are usually not normally distributed, and that therefore "mean", "SEM" and "t-test" are usually not sensible. Often, it is reasonable to assume a log-normal distribution here, so you should do all your statistics on the log-expression values (where you can sensibly calculate means, SEMs and t-tests).
Also note that a bar plot is the (unfortunately most common but) worst possible way to present expression data. If your y-axis is "expression", note that the uncertainty is not symmetric around the location measure, so showing only one-sided error bars would be wrong (or at best incomplete). In any case, make sure that the bars extend to a meaningful reference value (this may or may not be 1 for "expression" and 0 for "log expression"; it depends on the normalization and meaning of the data). A far better alternative is to show dots instead of bars (either a single dot/symbol at the position of the location measure or the complete data, with each dot representing an individual measurement). You can then still use error bars to indicate SEMs or (better!) confidence intervals. Whereas bar plots highlight the absolute (mean) values compared to the reference (where often simply 0 is used, as in the attached example), dot plots highlight the differences between the groups, which is typically the main research question. If you have >20 or so values per group you could also go for a box(-and-whisker) plot (unfortunately not really available in Excel; workarounds and tricks known to me fail when values are negative). I attached an example showing the same data as "1d scatterplot" (individual data), "dot plot" (with error bars) and "box plot".
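The point about doing the statistics on log-expression values can be illustrated with a short Python sketch. It assumes strictly positive expression values and uses log base 2, which is a common but arbitrary choice; the function name and the returned keys are illustrative. Back-transforming mean ± SEM from the log scale gives a geometric mean with asymmetric error limits.

```python
from math import log2, sqrt


def log_scale_summary(values):
    """Mean and SEM on the log2 scale, back-transformed to the original scale."""
    logs = [log2(v) for v in values]
    n = len(logs)
    mean = sum(logs) / n
    sd = sqrt(sum((v - mean) ** 2 for v in logs) / (n - 1))
    sem = sd / sqrt(n)
    return {
        "geometric_mean": 2 ** mean,
        "lower": 2 ** (mean - sem),  # lower error limit on the original scale
        "upper": 2 ** (mean + sem),  # upper error limit on the original scale
    }
```

On the original scale the upper limit lies farther from the geometric mean than the lower one, which is the asymmetry mentioned above.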
  • asked a question related to Microsoft Excel Data Analysis
Question
2 answers
How accurate is the relationship given by Excel's trendline? If there are several parameters or variables involved in the relationships and it is preferable to solve the relationships one by one rather than putting a whole bunch of data into SPSS a certain form of end equation is required. How can it be done using Excel?
Relevant answer
Answer
You can enable the Analysis ToolPak add-in (which adds the "Data Analysis" command) and use its Regression tool.
  • asked a question related to Microsoft Excel Data Analysis
Question
6 answers
I have an Excel table with more than 10 columns and 600.000 rows. I want to remove duplicate rows, i.e. rows that contain the same values in every column.
I tried (In LibreOffice and Open Office)
CtrlA>data>filter>standard filter
Column A = not empty AND
Column B = not empty AND
....
More Options>No duplicates
I think it was supposed to be done, but nothing happened.
When I tried it with 2 or 3 columns, it worked. But is there any way to do this stuff with more columns?
For now I solved the problem by a temporary solution but I need to learn the real solution.
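One workaround outside the spreadsheet, sketched here under the assumption that the table can be exported to CSV with a header row, is to drop exact duplicate rows with a few lines of Python. The file paths passed to `dedupe_csv` are up to the user, and both function names are illustrative.

```python
import csv


def unique_rows(rows):
    """Keep the first occurrence of each row, comparing all columns."""
    seen = set()
    kept = []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


def dedupe_csv(src_path, dst_path):
    """Copy a CSV file with a header row, dropping exact duplicate data rows."""
    with open(src_path, newline="") as src:
        rows = list(csv.reader(src))
    header, body = rows[0], rows[1:]
    with open(dst_path, "w", newline="") as dst:
        csv.writer(dst).writerows([header] + unique_rows(body))
```

Rows are compared as exact text, so values that differ only in formatting (for example, 1.0 versus 1) are treated as different.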
Relevant answer
  • asked a question related to Microsoft Excel Data Analysis
Question
5 answers
I am working on my undergraduate thesis and I am meant to carry out a load forecast on my school's data. While collating the data for my forecast, I noticed that some values were missing. What method can I use to fill in these values to increase the accuracy of my forecast? I am using regression analysis and am also working with artificial neural networks. I am working in Microsoft Excel.
Relevant answer
Answer
Hello,
You can use artificial neural network (ANN) tools. They are very good for normalization and regression.
You can also turn this black-box tool into an open one by using the weights from the ANN in an Excel sheet. That way, everything is clear.
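For comparison, a much simpler gap-filling approach than the ANN suggested above is linear interpolation between the nearest known neighbours. The Python sketch below shows that swapped-in technique, not the ANN method; the function name `fill_missing_linear`, the use of `None` for missing readings and the sample series are assumptions for illustration.

```python
def fill_missing_linear(values):
    """Fill None gaps by linear interpolation between known neighbours.

    Gaps at the very start or end are left as None, because there is
    no neighbour on one side to interpolate from.
    """
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled

# Hypothetical load readings with two missing values in the middle
print(fill_missing_linear([10.0, None, None, 16.0, None]))
```

Whether interpolation is adequate depends on how long the gaps are and how strongly the load varies within them.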
  • asked a question related to Microsoft Excel Data Analysis
Question
6 answers
I have been asked to apply this method to a set of results for a paper that is due, but I do not know much about statistics, so I am at an impasse. Could someone help me?
Relevant answer
Answer
  • asked a question related to Microsoft Excel Data Analysis
Question
4 answers
I cultured a bacterium under 8 different culture conditions to find its VOC profile under each condition. Can you tell me the best way to analyze the data using SPSS? 35 VOCs (peaks: retention time) were identified for each condition, and I want to compare them to find the similarities.
Relevant answer
Answer
Hi Mohsen,
If I understand you correctly, you cultured 1 bacterial strain under 8 conditions and now you want to compare these conditions for 35 VOCs (this is quite few for a GCMS analysis of a culture; how did you do peak detection, alignment and matching?).
Normally you would just use a Kruskal-Wallis test (grouping variable: the culture conditions; dependent variable: VOC concentration, not normally distributed), and for the VOCs with a p-value below 0.05 you can perform post-hoc tests (you can choose this in SPSS). However, since you test multiple VOCs, you will have to account for multiple comparisons by lowering alpha (the p-value threshold).
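One common way of lowering alpha is the Bonferroni correction, which divides alpha by the number of tests. The answer does not name a specific correction, so treat this choice as an assumption; the function name `bonferroni_significant` and the p-values are hypothetical.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Return the corrected threshold and the tests that pass it.

    p_values maps a test name (here: a VOC) to its p-value.
    """
    threshold = alpha / len(p_values)
    significant = [name for name, p in p_values.items() if p < threshold]
    return threshold, significant

# Hypothetical Kruskal-Wallis p-values for three VOCs
p_values = {"voc_a": 0.0004, "voc_b": 0.03, "voc_c": 0.012}
threshold, significant = bonferroni_significant(p_values)
print(threshold, significant)
```

With 35 VOCs the corrected threshold would be 0.05 / 35, roughly 0.0014, which is why the number of replicates matters so much for power.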
Several additional questions arise: how many replicates per culture condition do you have? This affects the effect size you can detect. Can you convert the 8 categories into subcategories? This increases power, etc.
Could you provide additional information?
Lieuwe
  • asked a question related to Microsoft Excel Data Analysis
Question
3 answers
How can we remove the leading number from the exponent labels on the Y-axis of a column graph, so that the axis shows only powers of 10 without decimal numbers, as in the attached example?
Relevant answer
Answer
I totally agree with Clementine, but with a small modification:
if you plot y vs. x,
then convert y to Y=log(y)
then plot the new Y vs. x
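The transformation step above can be sketched in Python as follows. The answer only writes log(y); base 10 is assumed here because the question is about powers of 10, and the function name `log10_transform` and the sample values are illustrative.

```python
import math

def log10_transform(ys):
    """Return Y = log10(y) for each value; all values must be positive."""
    return [math.log10(y) for y in ys]

# Hypothetical y values spanning several orders of magnitude
ys = [1, 10, 1000, 250000]
print(log10_transform(ys))
```

After the transformation the axis tick at Y = 3 corresponds to 10^3 in the original units, so the labels can be read as pure exponents.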
  • asked a question related to Microsoft Excel Data Analysis
Question
6 answers
And population standard deviation?
Relevant answer
  • asked a question related to Microsoft Excel Data Analysis
Question
16 answers
I am looking for a comprehensive question to make a reasoned evaluation of a postgraduate student's ability to work with MS Excel.
Relevant answer
Answer
Dear Slawomir,
Thanks for your useful links.
In your opinion, can we summarize the several questions in just ONE?
  • asked a question related to Microsoft Excel Data Analysis
Question
5 answers
What do you think: should error bars in Excel be based only on the standard deviation, or should we divide it by the square root of the number of iterations?
Relevant answer
Answer
I agree with Justinsir:
First one - STDEV indicates the standard deviation - the deviation of the observations within a particular group from the mean value, while
Second one - STDEV/SQRT(number of iterations) - indicates the SEM - standard error of the mean - the standard deviation of the sample mean as an estimate of the population mean.
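The two quantities can be computed side by side in a short Python sketch. `statistics.stdev` is the sample standard deviation (the counterpart of Excel's STDEV); the function name `sd_and_sem` and the sample values are illustrative assumptions.

```python
import math
import statistics

def sd_and_sem(values):
    """Return (sample standard deviation, standard error of the mean)."""
    sd = statistics.stdev(values)         # spread of the observations
    sem = sd / math.sqrt(len(values))     # uncertainty of the mean
    return sd, sem

# Hypothetical repeated measurements of one group
sd, sem = sd_and_sem([1, 2, 3, 4, 5])
print(sd, sem)
```

The SD describes the scatter of the data and does not shrink with more measurements, whereas the SEM gets smaller as the number of measurements grows.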
  • asked a question related to Microsoft Excel Data Analysis
Question
7 answers
I thought of bar graphs with different colors for each group, but neither Excel nor Stata includes this option.
Relevant answer
Answer
This is a stripplot, done in Stata, with superimposed boxplots. The stripplot command, written by Nick Cox, will also superimpose means and confidence intervals (these can be calculated as normal, binomial or Poisson). You can install it in Stata by typing
. ssc install stripplot
  • asked a question related to Microsoft Excel Data Analysis
Question
117 answers
I have recently done a workshop on Tableau Software of Tufts University for Microsoft Excel Data Analysis. It gives very good graphs etc.
Relevant answer
Answer
Excel is quite a powerful instrument, often used in biomedical research to hold datasets. It does contain some basic statistical analysis and could serve for simple inferences.
I would however not call it a "statistical software" as it is not designed for such a use. Also, I would not recommend someone to use Excel for statistical analysis for the following reasons:
1. It is difficult to know what Excel is truly doing when applying a formula.
2. Excel has weird ways of handling missing data.
3. Programming an analysis in Excel is complicated, and mistakes easily pass unnoticed.
There are many easy to use free statistical packages available. R might be more difficult to manage for a beginner, so I would try out the following first:
2. OpenEpi http://www.openepi.com (online)
In epidemiology, once you decide you seriously want to analyse your data yourself, I would suggest taking a course and using a statistical package such as those offered by the following software:
1. STATA (licensed)
2. R (open source)
3. MATLAB Statistics toolbox (licensed, expensive, but quality support)
4. SPSS (yearly license)
5. Statistica (licensed, user friendly)
  • asked a question related to Microsoft Excel Data Analysis
Question
9 answers
If I have values in column A of a spreadsheet, from A1 to A5, for example 1, 2, 3.5, 4, 6, and I need to extract all values less than 5 into a new column, what is the best method? And how do I make the extraction automatic if the value 5 is stored in, for example, cell C1? That is, when I change the value in C1, the values extracted into the new column should change according to the new value in C1.
Relevant answer
Answer
The easiest way is with IF. You can use the formula that Ehsan gave you but if you don't want any zeros, just replace 0 by "". It would give:
=if(A1<5,A1,"")
If you also want nothing when the cells are empty in the column A, just add another IF for that:
=if(A1="","",if(A1<5,A1,""))
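The logic of the nested IF can be mirrored in Python to see what each row of the new column would contain. The threshold is passed as a parameter, which plays the role of a cell such as C1 from the question; the function name `if_below` and the use of `None` for empty cells are assumptions for illustration.

```python
def if_below(values, threshold):
    """Mirror of =IF(A1="","",IF(A1<threshold,A1,"")) for each row.

    None stands for an empty cell; "" marks a blank result.
    """
    result = []
    for value in values:
        if value is None:
            result.append("")       # empty input cell stays blank
        elif value < threshold:
            result.append(value)    # value is kept
        else:
            result.append("")       # value is too large, leave blank
    return result

# The values from the question, with the threshold 5
print(if_below([1, 2, 3.5, 4, 6], 5))
```

Changing the threshold argument changes the extracted values, just as editing the threshold cell would in the spreadsheet.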
  • asked a question related to Microsoft Excel Data Analysis
Question
22 answers
Excel is very basic; SigmaPlot and SPSS are OK, but again offer no support for simple non-parametric plots, e.g. box-and-whisker. What are others using?
Relevant answer
Answer
Hi Felix, short answer: R. It's free, very flexible, good plotting possibilities, very dynamic with a growing user base and transparent developments. Box-Whisker plots are not a problem at all. In R, there are many packages in which commands are organized according to a subject etc., so you will find many packages with which to make them. Yes, it is command line-based (except for yet another package which offers a modest user interface), but there are so many resources out there already, many free to download, plus extensive searchable email list, that doing something like a box plot is no problem at all. As far as I know, you can basically do everything in R that you do with SigmaPlot, SPSS, Excel, and in a much more flexible, and often more correct, way. Best, Susanne
  • asked a question related to Microsoft Excel Data Analysis
Question
3 answers
Water geochemistry
Relevant answer
Answer
See if any of the advice here helps: http://peltiertech.com/Excel/Charts/ChartIndex.html. The word collins did not appear, but charts can be called many things.
  • asked a question related to Microsoft Excel Data Analysis
Question
29 answers
I'm looking for software to create a transparent forest plot of odds ratios. Can you recommend any programs?
Relevant answer
Answer
Metafor package in R
metan in STATA
openmetaanalyst a free software from Tufts Medical Center