Questions related to Social Network Analysis
I'm curious whether it is possible to study the ego-networks of a person, organization, bank, city, etc. using ERGM and SAOM. Which one is better suited for research? Can you recommend some papers on it?
RG now has more than 25 million users worldwide, as announced at
I intend to use social network analysis (SNA) as a research method for my study. However, I could not get straightforward resources using social network analysis as a method for qualitative data analysis.
I have a very generic question, more of a discussion prompt. As you know, the number of papers devoted to modular networks (networks with communities, or partitioned networks) is increasing. The majority of this research is about community detection. My curiosity is the following: there are papers dealing with the problem of ranking nodes across communities. For instance, a global ranking can be obtained by aggregating rankings obtained locally within each community (using some centrality measure). When you aggregate rankings obtained within different communities, you can use, for instance, a "primus inter pares" style: you compare nodes in different communities only if they have earned the same rank within their own communities. This is a simple idea and surely not the only one. I have been searching for papers dealing with this type of conceptualisation or application on modular networks, but I haven't found much. Something similar is done when teams in different European football leagues (communities) are compared in order to form "qualification groups" for the final part of the Champions League (across communities), but there they use scores obtained within each specific football league. Thanks very much for your attention.
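To make the "primus inter pares" idea concrete, here is a minimal Python sketch. It is only one possible reading of the question: the node scores, the community labels and the tie-breaking rule (raw score among nodes with the same local rank) are all assumptions made for illustration.

```python
from collections import defaultdict

def rank_within_communities(scores, membership):
    """Return {node: local_rank}; rank 1 is the top-scoring node of its community."""
    by_community = defaultdict(list)
    for node, community in membership.items():
        by_community[community].append(node)
    local_rank = {}
    for nodes in by_community.values():
        # Highest score first; node name breaks exact ties deterministically.
        ordered = sorted(nodes, key=lambda n: (-scores[n], n))
        for position, node in enumerate(ordered, start=1):
            local_rank[node] = position
    return local_rank

def primus_inter_pares(scores, membership):
    """Global order: local rank first, then raw score among equally ranked nodes."""
    local_rank = rank_within_communities(scores, membership)
    return sorted(scores, key=lambda n: (local_rank[n], -scores[n], n))

# Hypothetical centrality scores and community labels.
scores = {"a": 0.9, "b": 0.4, "c": 0.7, "d": 0.6, "e": 0.2}
membership = {"a": "X", "b": "X", "c": "Y", "d": "Y", "e": "Y"}
global_order = primus_inter_pares(scores, membership)
```

With this rule, node "d" (second in community Y) is placed ahead of node "b" (second in community X) only because both hold the same local rank; nodes with different local ranks are never compared on raw score.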
Hi there. I'm trying to make SAOM work with an innovation network based on cooperation, because it has advantages in dealing with longitudinal networks. There are a few things I'm a bit confused about.
1. How could I tell if my data is suitable for this model? Are there any specific metrics?
2. I was planning to use SAOM to explore the effect of multidimensional proximity on the evolution of innovation cooperation networks over 4 time observations:
array(c(N1, N2, N3, N4), dim = c(59, 59, 4),
which means it is an undirected, longitudinal network. But it requires 3 waves of input data for a dyadic covariate(?). So is it a proper approach to add the change in the number of collaborations (the edge weights) between two actors for each period, as in: edge→varDyadCovar→array (c(edge1-2, edge2-3, edge3-4), dim= c(59 ,59 ,3)?
3. If I want to test the effect of the proximity indicator on the network evolution, is it necessary to add it to the model at the beginning:
mydata <- sienaDataCreate(cooperation,
or add it in a subsequent command:
myeff <- includeEffects(myeff,interaction1="GDP")
with command like this?
4. The official RSiena manual mentions that 'A default model choice could consist of the outdegree(density) and reciprocity effect', but my output only contains the rate of network evolution and the outdegree results. How can I view the influence of my input variables in the Siena table?
Looking forward to your thoughts and thanks in advance!
I have questions regarding comparing frequencies between groups. I will be happy if someone can help.
So I will describe first briefly my research design:
- I am analysing online shaming, which has, say, 10 types (10 types of shaming).
- I am analyzing 6 cases (multiple case studies) of shaming events.
- I am using thematic analysis.
- In the data (comments from social media) I am analyzing how many types of shaming occur in each case.
- In each case, I have obtained through thematic analysis how many times each type of shaming occurs (say, type 1 occurs 50 times in case 1, type 2 occurs 124 times in case 1, type 1 occurs 12 times in case 2, type 2 occurs 32 times in case 2, etc.).
- The 6 cases will be grouped into three groups by the theory, so I will have three groups (in one group there will be 2 cases, in the second group there will be another 2 cases and in the third group there will also be the other 2 cases).
- I want to compare the frequencies of types of shaming between these three groups.
So how do we compare frequencies/proportions between groups?
I must mention that the total number of codings differed between cases. For example, in case 1 the number of codings was, say, 1,200, and in case 2 it was, say, 800. The number of codings = the number of all codings related to the types of shaming. So I can't just compare the raw frequencies; I have to weight them. Does anyone have an idea how to compare frequencies between cases where the totals are different?
Thank you so much for your help.
SHORT QUESTION: How to compare frequencies between groups where in each group there are different cases and each case has a different number of total codings (thematic analysis)?
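One common way to handle unequal totals is to compare within-case (or within-group) proportions rather than raw counts, and to test a contingency table with a Pearson chi-square test of homogeneity. The sketch below uses only the illustrative numbers from the question (50 and 124 out of 1,200; 12 and 32 out of 800) and lumps everything else into an "other types" column, which is an assumption made for illustration. Note also that the chi-square test assumes independent codings, which may not hold for comments clustered within a case.

```python
def row_proportions(table):
    """Convert each row of counts into shares of that row's total."""
    return [[count / sum(row) for count in row] for row in table]

def pearson_chi_square(table):
    """Return (statistic, degrees_of_freedom) for a rows-by-columns count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof

# Rows: case 1 and case 2; columns: type 1, type 2, all other types.
# "Other" is simply the stated total minus the two listed counts.
table = [
    [50, 124, 1200 - 50 - 124],
    [12, 32, 800 - 12 - 32],
]
shares = row_proportions(table)
chi2, dof = pearson_chi_square(table)
```

To compare the three theory-based groups, the rows would be the pooled counts of the two cases in each group and the columns the 10 shaming types; the statistic is then compared against a chi-square distribution with the returned degrees of freedom.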
I'm looking for datasets containing coherent sets of tweets related to Covid-19 (for example, collected within a certain time period according to certain keywords or hashtags), labelled according to whether they contain fake or real news, or according to whether they contain pro-vax or anti-vax information. Ideally, the dataset would also contain a column with the textual content of each tweet, a column with the date, and columns with 1) the username/ID of the author; 2) the usernames/IDs of the people who retweeted the tweet.
Do you know any dataset with these features?
I plan to study hypersexualization on TikTok using a descriptive approach and the case study method; however, I have a problem choosing my purposive sample:
- I have a list of challenges (the so-called sexy challenges) that represent the main features and commonalities but from 2019 to 2021.
- I have a list of the most followed TikTokers whose content is hypersexualized most of the time.
In your opinion which one should I use?
I am trying to help a doctoral student (my daughter, actually) do a meta-analysis on a medical topic. I would like to apply social network analysis to a bibliographic survey (authors, keywords, abstracts and citations). I plan to use the CiteSpace application, which looked like a good option at first glance. Unfortunately, being an independent researcher, I have no access to institutional databases and must use open sources such as Google Scholar for my bibliographic research.
But the software is mainly dedicated to analysing Web of Science type data, even if the manual mentions the option of using bibliographic records from other licensed sites like PubMed (with the same issue of requiring licensed access).
Do you know ways of formatting Google Search results according to Web of Science format, either by hand (csv file) or using dedicated applications (e.g., R scripts)?
Thanks in advance, Hubert
I'm researching autoencoders and their applications in machine learning problems, but I have a fundamental question.
As we all know, there are various types of autoencoders, such as the Stacked Autoencoder, Sparse Autoencoder, Denoising Autoencoder, Adversarial Autoencoder, Convolutional Autoencoder, Semi-Autoencoder, Dual Autoencoder, Contractive Autoencoder, and others that improve on earlier versions. Autoencoders are also known to be used in Graph Networks (GN), Recommender Systems (RS), Natural Language Processing (NLP), and Computer Vision (CV). This is my main concern:
Because the input and structure of each of these machine learning problems are different, which version of the autoencoder is appropriate for which machine learning problem?
Software Experts: Ever wanted to write a book? Here's an opportunity close to it that you may not want to miss. Please see
for more details.
I have a task on my social networks analysis course and I need to find a full dataset with the Medici family and its attributes. I'd be glad for any help.
If you are conducting a social network analysis (SNA) to identify a knowledge broker within a project, is it okay to describe the project in broad terms within your paper? Or, does it always have to be anonymous? If you know any examples where an SNA paper describes the project, can you link it for me? Thanks!
We all know the statistics that Facebook is the most popular social network in the world. But which social networks are the most popular on a smaller level (e.g. country or region)?
I am not only interested in statistical evidence but also in your own impression and sense: are there networks or platforms in your country/region that seem to be more popular than Facebook? If so, do you feel there is a reason why Facebook is not the most favoured network?
For my doctoral research, I have a dataset of 8 teams, with 2 teams each from 4 organizations, and I am examining peer centrality in team advice networks. These are directed networks. I have created adjacency matrices, and each matrix has 12 to 30 nodes.
1. Should I test each team network individually, or combine them to get organizational correlations? Should any other partition be applied?
2. What should be my main considerations when working with visualization of small networks?
3. I used the Yifan Hu layout (output) for betweenness centrality related to general workplace advice when I ran the first trials. What should I be using for best rendering?
4. What tests should I run, and should I report them in writing?
I have started writing an article related to "Narcissism" and its effect on "Social Media". Can anyone suggest a good quality journal where I can submit it? Basically, it will be a systematic literature review paper.
I have not finished writing yet, so I am not able to share my title or abstract.
I am solving a Social Network Analysis problem. I have 9 centrality measures in my problem and I am trying to combine them for creating a new centrality measure.
I have chosen TOPSIS as a combining method. Now I am looking for an easy method to assign appropriate weights to my criteria.
If you think you can help me and even introduce me to a better solution than TOPSIS, I will be glad if you share it with me.
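One frequently used, data-driven option for the weights is the entropy weight method: criteria whose values differ more across nodes receive larger weights. The sketch below pairs it with a plain TOPSIS closeness score. Everything in it is an assumption for illustration: the three-measure matrix is hypothetical (the question has nine measures), all criteria are treated as "higher is better", and all values are assumed to be strictly positive.

```python
import math

def entropy_weights(matrix):
    """Entropy weight method: one weight per column (criterion), summing to 1."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    divergences = []
    for j in range(n_cols):
        column = [row[j] for row in matrix]
        total = sum(column)
        shares = [value / total for value in column]
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(n_rows)
        divergences.append(1.0 - entropy)  # more spread across nodes gives more weight
    total_divergence = sum(divergences)
    return [d / total_divergence for d in divergences]

def topsis(matrix, weights):
    """Closeness of each row to the ideal solution; all criteria are benefits."""
    n_cols = len(matrix[0])
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_cols)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_cols)] for row in matrix]
    best = [max(row[j] for row in weighted) for j in range(n_cols)]
    worst = [min(row[j] for row in weighted) for j in range(n_cols)]
    scores = []
    for row in weighted:
        d_best = math.sqrt(sum((row[j] - best[j]) ** 2 for j in range(n_cols)))
        d_worst = math.sqrt(sum((row[j] - worst[j]) ** 2 for j in range(n_cols)))
        scores.append(d_worst / (d_best + d_worst))
    return scores

# Hypothetical data: 4 nodes (rows) by 3 centrality measures (columns).
centralities = [
    [0.9, 0.8, 0.7],
    [0.5, 0.4, 0.6],
    [0.2, 0.3, 0.1],
    [0.4, 0.6, 0.3],
]
weights = entropy_weights(centralities)
closeness = topsis(centralities, weights)
```

Because centrality measures are often strongly correlated, it may also be worth checking the correlations first and dropping near-duplicates before weighting, so that one underlying notion of importance is not counted several times.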
What are the main features you would extract from a social network to model wellbeing and mental health?
Also, what are the common formulas for the following features: engagement, popularity, participation, ego?
Social Network Analysis Datasets Needed?
I need the following datasets and/or any datasets that have one or more of the following features.
The datasets should allow interaction among online social users, recommender systems and an online social network server, or in a decentralized system.
Other static profiles (e.g. interests, locations) that could be preserved by privacy schemes.
Thank you in advance.
I want to take network data from various time periods of the same network and map it in a way that allows a clear representation of each time period. For example, taking network data for a 2013-2015 time period, a 2015-2017 time period and then a 2020-2021 time period. Importantly, the amount of data will be different and the time periods won't strictly be even (i.e. there might be a 1 year gap between time periods a and b and then 5 years between b and c). Also, I think it is important to highlight that these would be 'time-slices' of the same overall network, rather than distinct, unrelated networks.
I am hoping to present these different time-slices in one or two ways. First, I am wanting to be able to place a series of network maps next to each other based on temporal data to show how the network is changing over time. Second, if possible, I would like to produce a video/animation that shows the network changing over time.
I have been doing lots of reading on possible ways to achieve this type of analysis. I have been looking into stochastic approaches, specifically Markov models and Stochastic Actor-Oriented Modelling (SAOM), both of which I have seen used for similar projects. The only problem is that my maths is good but needs some work before I could comfortably use these approaches, so if stochastic modelling is the way forward, any suggestions for good tutorials?
I have also been looking into Social Sequence Analysis, specifically the use of Network Methods, as outlined by Cornwell (https://www.cambridge.org/core/books/social-sequence-analysis/network-methods-for-sequence-analysis/FFF842AF37364167E23AD03E50650336) which seems promising.
I feel as though I am reading a lot and getting a bit lost in all of the literature. Any advice would be greatly appreciated!
Does data from contact tracing help in establishing patterns of behavior and social interactions that lead to infections? There are cases here in the Philippines where patients have no travel history but still get the virus. It is probable that patterns of behavior of other members of the household (for example, working in enclosed and densely populated workspaces) might cause the infection. Just a thought.
I would like to know if it is possible to use SAOMs (Stochastic actor oriented models) to analyse weighted networks?
Thank you in advance,
How can we consider linguistic differences in the analysis of Twitter data?
I'm trying to use tweet data as a proxy for mental health status worldwide. How can I minimize the bias due to linguistic differences as I may miss some data in languages other than English?
The field of social network analysis and all its quantitative methods appear to be an interesting approach to material culture analysis in anthropology and archaeology. Since I want to gradually integrate this big body of knowledge into my archaeological toolbox, I need to know where to begin and what to learn first. I'm curious to know at what point in an undergraduate program this type of knowledge is taught.
I might follow one or two undergrad courses, read books and do some online MOOCs. But what should I learn first? Basic quantitative sociology, quantitative methods for the social sciences, programming (and in which language), statistics, mathematics and graph theory? It seems to me that there is a steep learning curve. In the end, it must be kept in mind that I don't want to turn myself into a mathematician/statistician. I only wish to improve my archaeological research with quantitative methodology.
Hi, basically the question above. I am very interested in ideology and cognition (language), and my interest is in looking at ideology as it moves around on social media. I have come to this point via applied linguistics (cognitive linguistics), where I looked at 'Alt-right' YouTube content, the language used, and how it is representative of the worlds we mentally construct. Cognitive linguistics is the perfect tool for conducting ideological linguistic analysis (political ideology, 'fake news', propaganda, etc.).
However, I am now very interested in using this analysis to look at how ideology 'behaves' on social media (e.g. Twitter). It is now my understanding that the best way to conduct this analysis would be through Social Network Analysis (and data mining). So, to reiterate, what skills would I need to acquire to be able to conduct SNA (or even just NA)?
I would like to know if it is currently possible to use temporal ERGMs (Exponential Random Graph Models) for analyzing weighted networks.
For now, it seems that software packages available to analyse TERGMs (tergm or btergm) only use binary networks.
Thanks in advance for your answer,
I want to generate some nice prediction plots from my MRQAP model. I've laid out my process below, and would be very grateful to get anyone's insight, as I'm not seeing much written about this online.
I am building my own regression models on network data in R, using the quadratic assignment procedure with Dekker and colleagues' (2007) double semi-partialling method. In other words, I am predicting the weight of an edge given its respective node traits. This approach uses node permutations of residuals to adjust for interdependence of observations in the network. (Regression with networks involves huge heteroskedasticity, because the observations are literally connected).
Traditionally, this method (MRQAP with DSP) just produces a p-value, and the original standard errors are suspect. So, I am using Doug Altman's method to back-transform p-values into new standard errors that better reflect the actual error range (read more here; thanks to @Andrew Paul McKenzie Pegman: https://www.bmj.com/content/343/bmj.d2090). This at least allows me to make nice dot-and-whisker plots of beta coefficients with their confidence intervals (estimate ± 1.96 × SE). However, I'd still really like to make predictions.
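For readers who want to reproduce the back-transformation, here is a minimal Python sketch of the approximation as I understand it from the linked BMJ note: z = -0.862 + sqrt(0.743 - 2.404 × ln(p)), SE = |estimate| / z, and a 95% interval of estimate ± 1.96 × SE. The function names and example values are hypothetical. The approximation is meant for two-sided p-values and becomes unreliable for very small ones; a permutation p-value of exactly 0 cannot be used at all.

```python
import math

def z_from_p(p_value):
    """Approximate the absolute z statistic behind a two-sided p-value."""
    if not 0.0 < p_value < 1.0:
        raise ValueError("p_value must lie strictly between 0 and 1")
    return -0.862 + math.sqrt(0.743 - 2.404 * math.log(p_value))

def se_from_p(estimate, p_value):
    """Back out a standard error from a coefficient and its p-value."""
    return abs(estimate) / z_from_p(p_value)

def ci95(estimate, se):
    """Normal-approximation 95% confidence interval."""
    return estimate - 1.96 * se, estimate + 1.96 * se

# Hypothetical MRQAP coefficient and its DSP permutation p-value.
beta, p_dsp = 0.42, 0.032
se_dsp = se_from_p(beta, p_dsp)
lower, upper = ci95(beta, se_dsp)
```

The interval is symmetric around the estimate by construction, so it inherits the normality assumption even though the p-value itself came from permutations.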
There seem to be two logical routes to make predictions from an MRQAP model.
First, you could just make predictions normally.
This relies on your observed residuals in the model to calculate the standard error for your predictions. I think this might even work, because the homoskedasticity assumption in regression is really about covariate standard error and p-values, not prediction; this means that a heteroskedastic model can still produce solid predictions (see Matthew Drury's & Jesse Lawson's helpful notes here: https://stats.stackexchange.com/questions/303787/using-model-with-heteroskedasticity-for-predictions). However, I would love some external verification on this. Any sources I can draw on to be confident I can use this for visualizing predicted effects from networks?
Second, you could simulate the predictions, like in Zelig/Clarify.
Simulation requires building a multivariate normal distribution, where each vector has a mean of one of your model coefficients, and where the vectors share the same general correlation structure as your variance-covariance matrix. Then, you make a sample from this multi-variate distribution (eg. grab a row of observations from each vector), use these as your coefficients, and generate a set of predictions. You then repeat this about 1000 times, grabbing different sets of slightly-differing coefficients.
In other words, this approach comes with a few assumptions: 1) Your coefficients might be slightly off, but if they're wrong, they follow a normal distribution. 2) The distribution for each coefficient is related to the other coefficients in specific, empirically observed ways. 3) These distributions don't necessarily have standard deviations that reflect the nice new standard errors generated from our DSP p-values! Ordinarily, I'd think that you'd want a multivariate normal distribution where assumptions 1 (normal) and 2 (correlated) both apply, but where you've also constrained each coefficient's distribution to reflect the standard errors from DSP. But there doesn't seem to be a good way to do this, since standard error doesn't directly factor into making a multivariate normal distribution (to my knowledge). You mostly just need the mean (coefficients) and a variance-covariance matrix.
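One possible way to get the "third" distribution: the diagonal of a variance-covariance matrix is the squared standard errors, so you can keep the model's correlation structure and swap in the DSP-based standard errors by rescaling, i.e. new_cov[i][j] = corr[i][j] × se_i × se_j. The Python sketch below does this and then draws coefficient vectors via a Cholesky factor. It is an illustration under assumptions, not an established MRQAP procedure: the coefficients, matrix and standard errors are hypothetical, and it takes for granted that the original correlations are still meaningful.

```python
import math
import random

def rescale_covariance(vcov, new_se):
    """Keep the correlations implied by vcov, but impose new standard errors."""
    k = len(vcov)
    old_se = [math.sqrt(vcov[i][i]) for i in range(k)]
    return [[vcov[i][j] / (old_se[i] * old_se[j]) * new_se[i] * new_se[j]
             for j in range(k)] for i in range(k)]

def cholesky(matrix):
    """Lower-triangular factor of a symmetric positive-definite matrix."""
    k = len(matrix)
    lower = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            partial = sum(lower[i][m] * lower[j][m] for m in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower

def draw_coefficients(means, cov, n_draws, seed=0):
    """Sample coefficient vectors from a multivariate normal (mean + L z)."""
    rng = random.Random(seed)
    lower = cholesky(cov)
    k = len(means)
    draws = []
    for _ in range(n_draws):
        z = [rng.gauss(0.0, 1.0) for _ in range(k)]
        draws.append([means[i] + sum(lower[i][m] * z[m] for m in range(i + 1))
                      for i in range(k)])
    return draws

# Hypothetical model output: intercept and one slope.
coefs = [1.0, -0.5]
vcov = [[0.04, 0.01], [0.01, 0.09]]   # original (suspect) covariance matrix
dsp_se = [0.4, 0.3]                   # standard errors back-transformed from DSP p-values
dsp_cov = rescale_covariance(vcov, dsp_se)
draws = draw_coefficients(coefs, dsp_cov, n_draws=1000)

# Simulated expected value for one hypothetical covariate profile.
x_new = [1.0, 2.0]
predictions = sorted(sum(b * x for b, x in zip(draw, x_new)) for draw in draws)
lower_95, upper_95 = predictions[24], predictions[974]  # rough 2.5th and 97.5th percentiles
```

Note that this gives an interval for the expected value only; a prediction interval for a single new edge would also need the residual variance added to each draw.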
To any kind souls out there who have read this far, what would you recommend? Should I just use normal prediction? Should I simulate with a multivariate normal distribution? Should I make some weird third multivariate-normal-distribution-that-somehow-resembles-my-standard-errors-made-indirectly-from-MRQAP-DSP?
Any thoughts would be appreciated!
During the COVID-19 pandemic, when most academic work and research has shifted to online and distance modes, I wonder if there are postdoc positions available without the restriction of being physically present in the lab/university. Any suggestions or links to open position advertisements would help me in finding a postdoc position.
I would like to do a social network analysis based on forum interaction messages, but I don't know how to build the database for it. Can you help me?
I am mostly interested in quantitative techniques for assessing social networks. In this regard:
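A common starting point is a weighted, directed edge list in which each reply creates a tie from the replier to the author of the message being replied to. The Python sketch below assumes a hypothetical export format where every message has an `id`, an `author` and a `reply_to` field; real forum exports will differ, and other tie definitions (for example co-participation in a thread) are equally defensible.

```python
from collections import Counter

def build_reply_edges(messages):
    """Count replier -> original-author ties; self-replies are ignored."""
    author_of = {message["id"]: message["author"] for message in messages}
    edges = Counter()
    for message in messages:
        parent_id = message["reply_to"]
        if parent_id is None or parent_id not in author_of:
            continue  # thread starter, or parent missing from the export
        source, target = message["author"], author_of[parent_id]
        if source != target:
            edges[(source, target)] += 1
    return edges

# Hypothetical forum export.
messages = [
    {"id": 1, "author": "ana", "reply_to": None},
    {"id": 2, "author": "ben", "reply_to": 1},
    {"id": 3, "author": "cho", "reply_to": 1},
    {"id": 4, "author": "ana", "reply_to": 2},
    {"id": 5, "author": "ben", "reply_to": 1},
]
edges = build_reply_edges(messages)
edge_rows = sorted((src, dst, weight) for (src, dst), weight in edges.items())
```

The `edge_rows` list (source, target, weight) can be written to a CSV file, which is a format most SNA tools import directly.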
1. Is there any existing model/theoretical framework?
2. Which software packages are appropriate for assessing such a network?
I am looking for a database where social networking site statistics (Facebook, Twitter, etc.) are available for Australian companies. Does anyone know whether something like that exists and/or whether there is software that can extract that data?
For researchers who come from a qualitative methods background, where is a good place to start? What handbooks, articles, etc. would you recommend? Are there interesting online sources to look into? I want to compile some resources for students and colleagues who want to know how to apply SNA.
To study students' interaction patterns, which is better: social network analysis or epistemic network analysis?
Recently, I became aware of social network analysis (SNA). In my mind, the way that SNA works seems quite close to fuzzy cognitive maps (FCM). So, I would appreciate any feedback on the following:
1) Which are the main differences between SNA and FCM?
2) In which areas/fields are these techniques more applicable?
3) Is there any up to date (time is subjective) literature on these topics that relates to environmental/agricultural policy?
Thanks in advance!!!
E-learning implementation relies on the effective modeling of (among other issues) content, participants-learners, content providers, infrastructural supplies, technology and the organisational culture thereof. These actors do not affect the end result in isolation but as a collective. The effects of each player on the whole can roughly be explained using Activity theory or Actor-network theory, but this is not very holistic. I have a feeling that SNT is better suited to investigating this problem, especially since culture has an important role in shaping the effectiveness of e-learning systems.
I am about to design a survey that measures the long-term impacts of social change of a specific scholarship program. I am currently struggling on how to design the questions for the survey, which should both express relational data and measure, from the single student's point of view, the impacts the student thinks to have on his/her society and social networks.
Does someone have a clearer idea, or examples of previous research done qualitatively with SNA?
Let G = (V, E) be a graph. We have to pick K vital nodes in the network that are capable of spreading information throughout the network. How should we define them: do we pick a set of key players, or K individual players?
Research paper is attached here and link is also given.
- Ortiz-Arroyo, D. (2010). Discovering sets of key players in social networks. In Computational social network analysis (pp. 27-47). Springer, London.
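The difference between "K individual players" and "a set of K key players" can be shown with a small Python sketch. It is not the method of the cited chapter: as stand-ins, individual importance is measured by degree, and the quality of a set is measured by one-hop coverage (the chosen nodes plus their neighbours), selected greedily. The toy graph is hypothetical.

```python
from collections import defaultdict

def build_graph(edge_list):
    """Undirected adjacency sets from a list of (u, v) pairs."""
    graph = defaultdict(set)
    for u, v in edge_list:
        graph[u].add(v)
        graph[v].add(u)
    return graph

def top_k_by_degree(graph, k):
    """K individually best nodes: highest degree first, name breaks ties."""
    return sorted(graph, key=lambda n: (-len(graph[n]), n))[:k]

def covered(graph, chosen):
    """Nodes reached within one hop of the chosen set (including the set itself)."""
    reached = set(chosen)
    for node in chosen:
        reached |= graph[node]
    return reached

def greedy_key_player_set(graph, k):
    """Pick K nodes one at a time, each adding the most not-yet-covered nodes."""
    chosen, reached = [], set()
    for _ in range(k):
        candidates = [n for n in sorted(graph) if n not in chosen]
        best = max(candidates, key=lambda n: len(({n} | graph[n]) - reached))
        chosen.append(best)
        reached |= {best} | graph[best]
    return chosen

# Two hubs (h1, h2) sharing the same neighbours, plus a small separate star (h3).
graph = build_graph([
    ("h1", "a"), ("h1", "b"), ("h1", "c"), ("h1", "h2"),
    ("h2", "a"), ("h2", "b"), ("h2", "c"),
    ("h3", "x"), ("h3", "y"),
])
individuals = top_k_by_degree(graph, 2)
key_set = greedy_key_player_set(graph, 2)
```

Here the two individually strongest nodes are redundant with each other, while the greedily chosen set trades one hub for a weaker node that reaches an otherwise untouched part of the network. That redundancy is the usual argument for selecting a set rather than K separate top-ranked nodes.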
To publish a short communication on pattern recognition, RS, social network analysis, online news etc.
I am referring to a paper on Social Network Analysis. The authors computed centrality measures that I was easily able to find in the software UCINET and Socnet.
They have also measured state centrality, link betweenness centrality and out-status centrality which I am unable to find in UCINET 6.
Kindly share any tutorial or mention the steps to find the state centrality, link betweenness centrality and out status centrality.
I want to convert an unweighted graph into a weighted one in order to solve the link prediction problem. Is the best way to do this to use the similarity between nodes as the edge weight?
Beyond the beautiful lines or webs showing how connected an individual is, what other meaningful analysis could be done? For instance, are those directly connected to the individual more important or useful than those connected through others?
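As one illustration of similarity-based weighting, the Python sketch below assigns each existing edge the Jaccard similarity of its endpoints' neighbourhoods. Jaccard is only one of several neighbourhood measures (common neighbours and Adamic-Adar are others), and the toy graph is hypothetical. For link prediction, the weights should be computed from the training edges only, otherwise information about held-out links leaks into the features.

```python
from collections import defaultdict

def build_graph(edge_list):
    """Undirected adjacency sets from a list of (u, v) pairs."""
    graph = defaultdict(set)
    for u, v in edge_list:
        graph[u].add(v)
        graph[v].add(u)
    return graph

def jaccard(graph, u, v):
    """Shared neighbours divided by all neighbours of u and v."""
    union = graph[u] | graph[v]
    if not union:
        return 0.0
    return len(graph[u] & graph[v]) / len(union)

def weight_edges(edge_list):
    """Map every edge of the unweighted graph to a Jaccard-based weight."""
    graph = build_graph(edge_list)
    return {(u, v): jaccard(graph, u, v) for u, v in edge_list}

# Hypothetical unweighted graph: a triangle a-b-c with a pendant node d.
edge_list = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
weights = weight_edges(edge_list)
```

Note that the pendant edge (c, d) receives weight 0 because its endpoints share no neighbours, so in practice a small constant is sometimes added to keep existing edges distinguishable from non-edges.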
I think I can study this behavior with some actions like:
following the trend challenges
posting on Instagram about trend news (to have a reaction about trend news I mean)
What do you think?
would you please inform me about this?
I already have experience with the "muxViz" tool and the "multinet" package in R. They use predefined layouts such as "Fruchterman-Reingold". After plotting the network, I need to change the position and color of a group of nodes, but these tools do not allow such changes. I am wondering whether there is any library like tkplot in igraph (an interactive visualization tool for single-layer networks).
I have a number of ongoing researches on adaptive web mining techniques and online social network analysis with applications.
Collaboration with funding support for presentation of research outputs in top conferences, workshops and international journals is highly solicited.
Please you can contact me via firstname.lastname@example.org
I have been working with my smartphone for a while and can complete 90% of my tasks without problems. I am really happy, as I have become very flexible in doing research. However, I am not sure I am using an efficient way of doing Social Network Analysis. Can you recommend tools (apps, browser-based portals, etc.) if you do social network analysis on your smartphone?
I am a master's student in software engineering, looking for advice or ideas on how to find a topic related to link prediction or node importance estimation in networks. Thanks.
For my social network analysis, it was hard to find one large organization, so I am collecting data from different small organizations in different sectors. I want to combine all of them in one study. I want to make sure there is no technical flaw in my approach. Would it be possible to aggregate the results of all of them? If so, how can I interrelate groups from different organizations?
In SNA, some scholars use parametric statistical tools, while more use non-parametric tools, for tests of ranks, correlations, etc. on network descriptives/metrics. Which is the right one to use for SNA data: parametric or non-parametric?
We can see that sentiment analysis has made a serious mark in social network analysis, marketing and opinion research. Web scraping provides huge data sets from which conclusions can be drawn, and all of that is ushering in a new era in statistical analysis.
Is there a place for classical macroeconometrics?
I have seen some text analysis for international business; I can also see that for micro-level problems you can easily use, let's say, R's RQDA, and that topic modeling can have its place in text analysis of any kind.
Also, I do know that in the private sector machine learning is applied for forecasting.
Yet it seems that academia is rigid in that sense.
I need to do longitudinal social network analysis, someone recommended SIENA, a package in R. Has anyone used this? Is it easy to use this package in R? What data formats do they need?
Any other suggestions?
Hi, fellow researchers! My case is a little bit special. I am doing multiple regression (one dependent variable and several independent variables). My data does NOT have any multilevel structure (e.g., students nested in classrooms, etc.). However, when measuring the dependent variable, the measures for different subjects depend on each other. Say that Y is the dependent variable and y is one realization of Y; then the y's are dependent on each other. To deal with this dependency among the y's, I used a permutation test for the regression coefficients, using the 'lmp( )' function in the lmPerm package. Based on the permutation tests, R calculates sums of squares. Does it make sense to calculate R-squared based on these sums of squares? Thank you!
I'm working on a social network analysis problem.
My goal of the analysis is to predict links in this network.
My network is bipartite, and I want to split it into train, validation and test sets (each set being a network) in order to check my model's efficiency.
What is the best way to split my initial network into train, validation and test networks?
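A simple baseline is to split the edges (not the nodes) at random, so that all three networks share the same node sets and differ only in which links they contain. The Python sketch below does this with hypothetical share values and a hypothetical toy network. It deliberately leaves out two things a real evaluation usually needs: sampled non-edges as negative examples, and a check that every node keeps at least one training edge. If the edges carry timestamps, a chronological split is often preferred over a random one.

```python
import random

def split_edges(edges, val_share=0.15, test_share=0.15, seed=42):
    """Randomly partition an edge list into train, validation and test parts."""
    shuffled = list(edges)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_share)
    n_val = round(len(shuffled) * val_share)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

# Hypothetical bipartite network: every user is linked to every item (20 edges).
users = ["u0", "u1", "u2", "u3", "u4"]
items = ["i0", "i1", "i2", "i3"]
all_edges = [(u, i) for u in users for i in items]
train, val, test = split_edges(all_edges)
```

The model is then fitted on the train network, tuned on links held out in `val`, and scored once on the links in `test`.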
I am writing my master's thesis, which concerns soil quality analysis related to relational goods in small farms. The social relations were analysed through questionnaires. The questionnaires highlighted all the external relations that a company has. I've used Gephi to draw the networks, but I now need some easy-to-compute indicators to formalize what I've depicted. Do you have any suggestions? I am also interested in indicators for comparing the biological results (soil quality) with relational goods.
I proposed a comprehensive recommender system for e-commerce usage, but unfortunately I can't find any data-set for the evaluation step. I need a data-set containing:
2- Product features (category, price, color, brand, author, RAM, etc., which can vary according to the category)
3- User demographic information (age, gender, etc.)
4- User purchase history
5- User browsing history (visiting product's page)
Can anybody help me find a data-set with these features, please?
Publishing and citing behavior varies across fields. Different fields prefer different dissemination channels for research: in the social sciences, books are preferred over journal articles, while in computer science, results are mostly published in conference papers. The number of references per article also varies across disciplines. Similarly, some journals are multidisciplinary, some are open access and some are closed access. The impact factor does not solve the problem of comparing journals across domains. Which metrics, measures or factors could be important in comparing journals within and across disciplines? References to any related articles will be highly appreciated.
I am trying to triangulate the analysis with a social network analysis of specific emergent relationships in NVivo, but I do not seem to get clarity from the NVivo guidelines.
I am currently learning how to construct multi-level networks in social network analysis, but I lack learning materials, references, or online courses for this, especially ones using R as the analytical tool. Do you have any suggestions?
By the way, I am working on case study in the field of agriculture and the subjects would be the smallholder individual farmers and external actors.
Are there hands-on experts who can support/consult on a telecoms social network analysis on community detection, social ties and relationship mapping?
I need your assistance and constructive criticism to a) evaluate parts of my method which are correct b) find weak points and improve on them.
I am far from an expert on ERGM and my case is rather "special" because I am dealing with a large network (most examples that I found were dealing with relatively smaller networks).
I have a network of 7 million edges and 5 million nodes. Nodes have several quantitative attributes. My main goal is to find if these attributes influence the probability of tie formation and if people with similar values tend to have a higher probability to form relations.
Since the network is too large, I took a uniform independent sample of 26697 nodes. This sampling method is favored by the literature (see for example http://www.minasgjoka.com/papers/wosn2012-kurant_coarse-topology.pdf). All of their edges, even relations to nodes that were not in the sample, were included. The sampled network had 39983 nodes and 67024 edges. Then I built a couple of models; their results are attached in the text file along with the gof of the last one.
I have several questions regarding my results:
1) Do I have to include network metrics (mutual, kstar, etc) if these do not revolve around my hypotheses? Even if I find any results this will probably be irrelevant to the topic that I am working on.
2) I actually did try to build a model for mutual out of curiosity but got back awful diagnostics for mcmc (even with 100,000 sample and 50000 burnin). Instead of normal plots on the right side of the plots printed by mcmc.diagnostics the plots were truly all over the place.
3) The AIC and BIC seem to be quite high compared to other examples. Does it matter? My suspicion is that this is a result of the size of the network.
4) The gof does not seem to fit the data well in several metrics while it is effective in others up to a level. Given the size of the network I am not sure that I will ever get a proper model that would fit the data exactly. Is this however even relevant? Can I still make assertions about my node attributes affecting the probabilities for tie formation?
ACM, Association for Computing Machinery, well known for its Turing Award (the equivalent of Nobel Prize for Computer Science), pays travel expenses to help your company / college / conference / event / chapter host one of my talks. Below are the details to make the requests. Please feel free to take benefit of this useful service and spread the word. No honorarium needed for the speakers - we serve entirely voluntarily.
Machine Learning for Veracity of Big Data
Approaches to Establishing the Veracity of Big Data
Complex networks have a number of common features, but is there an outstanding feature of each complex network that does not exist in the rest? For instance, the rich-club property is seen more in brain networks. Can we say each of them has a unique feature?