Distributed Algorithms - Science topic

A distributed algorithm is an algorithm designed to run on computer hardware constructed from interconnected processors.
Questions related to Distributed Algorithms
  • asked a question related to Distributed Algorithms
Question
7 answers
Currently, I am exploring federated learning (FL). FL seems likely to become a trend soon because of its promising functionality. Please share your opinion on the following questions.
  • What are the current trends in FL?
  • What are the open challenges in FL?
  • What are the open security challenges in FL?
  • Which emerging technology can be a suitable candidate to merge with FL?
Thanks for your time.
Relevant answer
Answer
  1. Communication Efficiency: Federated learning involves frequent communication between devices and a central server. Optimizing communication protocols and reducing communication overhead is a challenge, especially for devices with limited bandwidth.
  2. Heterogeneous Data: Nodes in a federated learning system may have diverse and non-i.i.d (independent and identically distributed) data. Developing methods to handle data heterogeneity while preserving model performance is crucial.
  3. Model Aggregation: Combining models from different nodes without compromising model accuracy or privacy is complex. Aggregation methods need to be robust against outliers, adversarial nodes, and noisy updates.
  4. Privacy and Security: Ensuring that individual node data remains private is a central concern. New techniques for encryption, differential privacy, and secure aggregation are needed to protect sensitive information.
  5. Bias and Fairness: Federated learning can inherit biases present in node data, leading to biased models. Addressing bias and fairness issues across distributed data sources is a challenge.
  6. Imbalanced Data: Some nodes might have imbalanced datasets, leading to biased models. Developing techniques to mitigate the impact of data imbalance on model training is essential.
  7. Node Heterogeneity: Devices can have varying computation power and energy constraints. Designing federated algorithms that accommodate such heterogeneity is important for scalability and inclusivity.
  8. Stragglers: Slow or faulty nodes can slow down the training process. Techniques for dealing with stragglers and their impact on overall model performance need to be explored.
  9. Model Deployment: Transitioning federated models to production environments while maintaining security, performance, and compatibility with different devices is a challenge.
  10. Cross-Domain Learning: Extending federated learning to scenarios where nodes have different domains or tasks is an emerging area that requires novel solutions.
  11. Adversarial Attacks: Federated learning models could be vulnerable to new types of attacks, including those targeting the aggregation process or compromising the central server.
  12. Resource-efficient Algorithms: Developing algorithms that optimize for computation, memory, and energy usage while maintaining model accuracy is crucial for resource-constrained devices.
  13. Regulatory and Ethical Concerns: Ensuring compliance with data protection regulations and ethical considerations is essential in federated learning, particularly when dealing with personal or sensitive data.
  • asked a question related to Distributed Algorithms
Question
6 answers
I was exploring federated learning algorithms and reading this paper (https://arxiv.org/pdf/1602.05629.pdf). In this paper, they average the weights received from the clients, as shown in the attached file. In the marked part, they use the total number of client samples and the number of samples of each individual client. As far as I have learned, federated learning was introduced to keep data on the client side to maintain privacy. Then how does the server know this information? I am confused about this concept.
Any clarification?
Thanks in advance.
Relevant answer
Answer
Thanks for your input. I have their code, and it follows the same approach. I have attached their code below.
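On the question above: the weighted average only needs each client's sample count, not the samples themselves. The following is a minimal sketch, assuming each client sends its locally trained weights together with its local sample count; the function name and the flat-list weight format are illustrative assumptions, not taken from the paper's code.

```python
def federated_average(client_updates):
    """Sample-count-weighted average of client weight vectors.

    client_updates: list of (weights, n_samples) pairs, where weights is a
    flat list of floats and n_samples is the client's local sample count.
    The server sees only the weights and the counts, never the raw data.
    """
    total = sum(n for _, n in client_updates)
    dim = len(client_updates[0][0])
    averaged = [0.0] * dim
    for weights, n in client_updates:
        share = n / total  # this client's fraction of all samples
        for i, w in enumerate(weights):
            averaged[i] += share * w
    return averaged


# Hypothetical example: two clients with 10 and 30 local samples.
clients = [
    ([1.0, 1.0], 10),
    ([4.0, 0.0], 30),
]
global_weights = federated_average(clients)
```

The sample count is metadata that a client reports alongside its update. Whether revealing that count is acceptable is a separate privacy decision for the deployment.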
  • asked a question related to Distributed Algorithms
Question
7 answers
There is an idea to design a new algorithm for the purpose of improving the results of software operations in the fields of communications, computers, biomedical, machine learning, renewable energy, signal and image processing, and others.
So what are the most important ways to test the performance of smart optimization algorithms in general?
Relevant answer
Answer
I'm not keen on calling anything "smart". Any method will fail under some circumstances, such as for some outlier that no one has thought of.
  • asked a question related to Distributed Algorithms
Question
2 answers
Frameworks such as Apache Storm, Flink, Heron and Spark were developed to run on clusters or in the cloud. These kinds of infrastructure do not have memory, CPU and bandwidth limitations. In contrast, computing resources at the network edge are constrained in their capabilities. I am aware of the Apache Edgent and NiFi frameworks. However, they were conceived to run locally on a single computing resource. If you want to run them on a distributed infrastructure, you might have to create your own stack of components (broker + framework).
Relevant answer
Answer
Himadri Nath Saha, it seems to be a fascinating motivating scenario. The investments in and the continuous growth of streaming games have imposed new response-time requirements (i.e., ultra-low latency). I am looking for DSP systems where latency-sensitive applications can be easily deployed on geo-distributed constrained resources without having to create a multi-tier stack of components.
  • asked a question related to Distributed Algorithms
Question
5 answers
Can you please tell me what the motivation is for applying a multi-agent system to the ant colony algorithm, since it is a distributed algorithm by nature? Does it really improve the solution?
Relevant answer
Answer
Amal
There must be an improvement, since with MAS you have means to negotiate and cooperate that do not exist within the ant colony alone. You should check for existing works that reinforce ant colonies with MAS. The conclusion is that MAS are generally used to facilitate ant colony communication and automation.
Hope this will help you
Hakima
  • asked a question related to Distributed Algorithms
Question
2 answers
Hello everyone.
I am trying to implement the Join-Idle-Queue algorithm in CloudSim. I want to improve the response time in comparison to other policies such as SJF and minimum execution time. However, I realized that in the basic CloudSim package, the Submit Cloudlet method of the DB class can only be used to allocate cloudlets to VMs with a static approach. I cannot use this method to activate my load balancer and obtain an idle VM, because all the cloudlets are allocated to VMs at clock time 0.1; if I activate my load balancer there, I do get an idle VM, but not at clock time 0.1, so I cannot allocate the cloudlet to it in this method. Somebody told me to use PowerDatacenter. If anybody knows what to do, please help me; it would be a great help.
Thanks.
Relevant answer
Answer
Thanks a lot Sir. Please refer me if you can.
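For readers unfamiliar with the policy discussed in this question, here is a minimal sketch of the Join-Idle-Queue dispatch rule in plain Python. It is not CloudSim code and uses none of its classes; the class and method names are hypothetical. It assumes the common formulation in which a VM reports itself idle when it finishes its last task, and the dispatcher falls back to a random VM when no idle VM is known.

```python
import random
from collections import deque


class JoinIdleQueueDispatcher:
    """Toy dispatcher: prefer a VM from the idle queue, else pick at random."""

    def __init__(self, vm_ids, seed=0):
        self.vm_ids = list(vm_ids)
        self.idle = deque(self.vm_ids)  # assumption: all VMs start idle
        self.rng = random.Random(seed)

    def assign(self):
        """Choose a VM for a task at the moment the task arrives."""
        if self.idle:
            return self.idle.popleft()
        return self.rng.choice(self.vm_ids)  # no idle VM known: random fallback

    def report_idle(self, vm_id):
        """Called when a VM finishes its last queued task."""
        if vm_id not in self.idle:
            self.idle.append(vm_id)


dispatcher = JoinIdleQueueDispatcher([0, 1, 2])
first_three = [dispatcher.assign() for _ in range(3)]  # drains the idle queue
overflow = dispatcher.assign()                         # random fallback
dispatcher.report_idle(1)                              # VM 1 finishes its work
next_vm = dispatcher.assign()                          # VM 1 is chosen again
```

The point of the sketch is that assignment happens per arrival and per completion event, so a simulator has to deliver tasks over time rather than bind all of them at a single clock tick.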
  • asked a question related to Distributed Algorithms
Question
2 answers
I want to know which algorithms you use in your integrated active chassis control system framework, and whether you consider the over-actuation due to the four electric motors. If so, which torque distribution algorithm is employed in your project?
Sincerely,
Aria Noori Asiabar (M.Sc. degree in vehicle dynamics and control)
Relevant answer
Answer
@Zeashan Khan
Thanks a lot.
  • asked a question related to Distributed Algorithms
Question
3 answers
I need a simulator to simulate a network of surveillance cameras in order to implement my distributed algorithm (I use smart cameras, fog computing and cloud computing).
Relevant answer
Answer
This paper might help you:
Simulation of a Video Surveillance Network Using Remote Intelligent Security Cameras
  • asked a question related to Distributed Algorithms
Question
5 answers
Can you please tell me what the motivation is for applying a multi-agent system to swarm algorithms (e.g., the ant colony algorithm), since they are distributed algorithms by nature?
Thank you
Relevant answer
Answer
Multi-agent systems run in a distributed environment (sometimes on a single computer), and the agents are capable of communicating with each other through message passing. Through this communication, agents can solve complex problems easily. If you use a suitable framework, it is probably not slower. Algorithmic processes, however, are straightforward. Note that MAS takes a different approach from the algorithmic process. In addition, MAS can provide results that could not be predicted in advance. Try it.
  • asked a question related to Distributed Algorithms
Question
4 answers
In a Research Consultancy, as part of the MBA requirements, analysing data can be complex, particularly if that data is extracted from verbal responses...
Relevant answer
Answer
Dear Nicky,
I don't know much about Fuzzy Cognitive Mapping or Normal Distribution Algorithm, so I can only give you some hints about Factor Analysis, but I would say that first of all the method you choose should fit your research question and not only the type of data you have.
If you aim to look for latent structures and structural factors, you could use Factor Analysis. FA works with ordinal data if you base the construction of factors on polychoric correlations. The lavaan package in R allows you to do so if you specify that you have ordinal variables: http://lavaan.ugent.be/tutorial/tutorial.pdf
Good luck!
Mireia
  • asked a question related to Distributed Algorithms
Question
5 answers
Hi Everyone,
Can someone suggest a summer school (2018) that focuses on algorithms, more specifically distributed algorithms, experimental algorithms and related topics?
Your suggestions will be highly appreciated.
Thanks in advance.
  • asked a question related to Distributed Algorithms
Question
1 answer
I am trying to develop a distributed algorithm and search for current ones. Your help would be nice
  • asked a question related to Distributed Algorithms
Question
1 answer
I'm still working to understand estimation of distribution algorithms (EDA) as applied to genetic algorithms. Can the probabilistic models that an EDA uses for generating new solutions be used by themselves? For example, in the Bayesian optimization algorithm (BOA), can the Bayesian network that is produced be extracted and used separately as a Bayesian classifier?
  • asked a question related to Distributed Algorithms
Question
2 answers
What could be an algorithm for computing a Pearson cross-correlation matrix in a distributed environment where my data is divided by id (say 1-4) and time (say Jan-Dec) among different nodes? Say Node A({id1, Jan}, {id2, Jan}), Node B({id3, Jan}, {id4, Jan}).
I'm wondering what strategy I could use so that I do not have to ship large amounts of data from one node to another, given that Pearson correlation is a pairwise computation. I'm OK with transferring small intermediate results between nodes. Alternatively, how should I partition my data by id and time so that I can efficiently calculate the cross-correlation matrix among multiple ids?
Relevant answer
Answer
Your data is partitioned on both time and id, so you can say that you want to calculate the cross-correlation matrix with Node A({id1, Jan}, {id2, Jan}) and Node B({id3, Jan}, {id4, Jan}), and then for Feb with Node A({id3, Feb}, {id2, Feb}) and Node B({id1, Feb}, {id2, Feb}).
Jan, Feb: data generated in the months of January and February. You can say id1, id2 are data from different sensors.
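One way to keep transfers small, sketched here under explicit assumptions: Pearson correlation can be computed from per-node sums (count, sum of each series, and sums of pairwise products), and those sums simply add across nodes. The sketch assumes the data is repartitioned by time so that each node holds the values of all ids for its own timestamps, that timestamps are aligned with no missing values, and that every series has non-zero variance. The function names are illustrative.

```python
import math


def partial_stats(rows):
    """Local sufficient statistics for one node.

    rows: list of tuples, one value per id, for the timestamps on this node.
    Returns (count, per-id sums, matrix of pairwise product sums).
    """
    k = len(rows[0])
    sums = [0.0] * k
    prods = [[0.0] * k for _ in range(k)]
    for row in rows:
        for i in range(k):
            sums[i] += row[i]
            for j in range(k):
                prods[i][j] += row[i] * row[j]
    return len(rows), sums, prods


def merge(a, b):
    """Combine the statistics of two nodes; only these small objects travel."""
    sums = [x + y for x, y in zip(a[1], b[1])]
    prods = [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a[2], b[2])]
    return a[0] + b[0], sums, prods


def correlation_matrix(stats):
    """Pearson correlation matrix from merged statistics."""
    n, sums, prods = stats
    k = len(sums)
    cov = [[prods[i][j] - sums[i] * sums[j] / n for j in range(k)] for i in range(k)]
    return [
        [cov[i][j] / math.sqrt(cov[i][i] * cov[j][j]) for j in range(k)]
        for i in range(k)
    ]


# Hypothetical data: columns are id1, id2, id3; node A holds Jan, node B holds Feb.
node_a = [(1.0, 2.0, 3.0), (2.0, 4.0, 1.0)]
node_b = [(3.0, 6.0, 2.0), (4.0, 8.0, 0.0)]
corr = correlation_matrix(merge(partial_stats(node_a), partial_stats(node_b)))
```

Each node ships only a count, a vector of length k and a k-by-k matrix, regardless of how many timestamps it holds. This one-pass formula can lose precision when values are large relative to their variance, so centering the data first may be worthwhile.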
  • asked a question related to Distributed Algorithms
Question
6 answers
Solving large-scale NP-hard problems is very time-consuming. Distributed algorithms based on MapReduce can speed up the computation and have many successful applications. However, meta-heuristic algorithms such as tabu search and simulated annealing are based on single-solution iteration, and Hadoop is not good at iterative computation, so little work can be found on distributed iterative meta-heuristic algorithms based on Hadoop or Spark. Is there any good implementation or idea for iterative meta-heuristics based on Hadoop or Spark? What are the challenges? Thanks.
Relevant answer
Answer
You can check this:
Tauer, G., & Nagi, R. (2013). A map-reduce lagrangian heuristic for multidimensional assignment problems with decomposable costs. Parallel Computing, 39(11), 653-668.
  • asked a question related to Distributed Algorithms
Question
4 answers
Several techniques can be applied in Peer-to-Peer networks to handle the presence of malicious nodes (Reputation Systems, Accountability, Distributed Consensus, etc...).
Which one do you think has the best trade-off between the capability to discriminate malicious nodes from honest ones and the cost of the technique (in terms of the number of messages, for example)? Obviously, none of the previous systems can fully decide (with 100% accuracy) whether a node is malicious or not.
Relevant answer
Answer
In summary: for very very large p2p systems (file-sharing), I would go with a reputation system / shared history; for small ones, I would go with accounting, bilateral interactions (private history).
  • asked a question related to Distributed Algorithms
Question
1 answer
I want to know which approximation algorithms and distributed algorithms are regularly used for the dominating set problem in wireless sensor networks.
Relevant answer
Answer
There are several common choices when applying approximation algorithms. The three most used are:
  • Basic Distributed Greedy Vertex Cover Algorithm (adaptation of Parnas and Ron’s algorithm).
  • Distributed Vertex Cover via Greedy Matching Algorithm
  • Distributed Vertex Cover Algorithm with Breadth-First Search Tree
When we inspect the theoretical performance, we find that the best one is the first algorithm. The worst performance has been observed with the BFS-tree-based algorithm. The reason is that asynchronous BFS construction is a costly operation.
I hope it will be helpful.
best regards
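To illustrate the idea behind the second algorithm in the list above, here is a sequential sketch of vertex cover via greedy maximal matching. It is a centralized simplification, not the distributed protocol the answer refers to. It relies on the standard result that taking both endpoints of every edge in a maximal matching gives a cover at most twice the optimal size.

```python
def vertex_cover_via_matching(edges):
    """Return a vertex cover built from a greedily chosen maximal matching.

    edges: iterable of (u, v) pairs. An edge joins the matching only if
    neither endpoint is already covered; both endpoints then enter the cover.
    """
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover


# Hypothetical example: the path 1 - 2 - 3 - 4.
path_edges = [(1, 2), (2, 3), (3, 4)]
cover = vertex_cover_via_matching(path_edges)
```

On this path the sketch selects all four vertices, while the optimum is {2, 3}, which shows the factor-two gap in practice.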
  • asked a question related to Distributed Algorithms
Question
2 answers
I want to implement a mutual exclusion algorithm with simulator software. I don't have a laboratory to test my algorithm.
Relevant answer
Answer
Hi Dimitry,
There exist many simulators and simulation frameworks (OMNeT++, NS2, NS3, SimGrid, ...), so it is difficult to give a general answer. It depends on the purpose of the simulation:
  1. Do you want to execute an algorithm and collect some data?
  2. Do you want to have a visualization of your algorithm?
  3. Do you want to study the termination?
  4. Do you want to execute some simulation campaigns, increasing the number of nodes of your system, varying the topology of your system, ...?
  5. Do you just want to play an algorithm, or do you plan to test the algorithm within a particular environment or protocol?
On top of all those questions there is another one: are you looking for a simulator in a particular language?
If you do not mind C++, you can use OMNeT++ (https://omnetpp.org/). This simulator can be used for algorithm or protocol simulation, offers a nice programming environment, and allows visualization of the executions.
Best regards
Olivier
  • asked a question related to Distributed Algorithms
Question
5 answers
locate the centre node in a wireless sensor network
distributed algorithm
polynomial complexity
Relevant answer
Answer
It is right, thank you a lot.
  • asked a question related to Distributed Algorithms
Question
12 answers
Hello everyone,
I would like to make a simulation of a particular Petri net.
Is there someone that can help me with some ideas or principles on this?
Thank you in advance!
Relevant answer
Answer
Some Principles and Ideas
Consider “making a simulation of a Petri net” to be the same as “writing a computer program using Petri nets”. To use Petri nets for writing a computer program is to organize the program using Petri net elements and annotations. To create this organization, first establish the relations of the computations in terms of places, transitions, inputs and outputs. Second, create annotations of the places, transitions, inputs and outputs for the computations.
An Example
Consider making a simulation of a countdown timer. The countdown timer begins from a specified value (say, 10). Using some interval, the timer value decreases by one. When the timer value reaches zero, the countdown ends.
This computation may be modeled in terms of a place, an input and a transition. The place has a mark annotation, a value that represents the timer value: the place is the timer. The input has a “fire” annotation, a computation that deducts one from the mark annotation of the input place. The input also has a status annotation: a ‘true’ value means the computation may proceed and a ‘false’ value means that it should not. The input has an “is enabled” annotation, a computation that determines the value of the status annotation; in this case, the status annotation of the input is ‘true’ if the mark annotation of the input place is greater than zero, and ‘false’ otherwise. The transition has a “fire” annotation, a computation that delegates the “fire” computation by “firing” its input. The transition has a status annotation: it is ‘true’ when the status annotation of its input is ‘true’, and ‘false’ otherwise. Consider too visual annotations of the net elements: a circle for the place, a square for the transition, an arrow from the circle to the square for the input, and dots for the mark annotations of places (empty for 0 marks, one black dot for 1 mark, two black dots for 2 marks, etc.). [The types of annotations considered so far are those found on Place/Transition nets, a kind of Petri net.]
To make the “simulation” interactive, consider an event annotation for the transition, a computation that delegates a user event (such as a mouse up event) to “fire” the transition. Furthermore, consider a visual annotation for the status of the transition – green if the status annotation is true and empty otherwise.
Based on this example, I created an interactive “simulation” of this system in PDF. Graphics were created using PowerPoint and exported as PDF. Computation logic was implemented as JavaScript programs. And interactions were integrated with the computation logic using the Acrobat/JavaScript API. [See link attached to this reply].
A Reference
For petri nets with many net elements and annotations, several other issues (such as naming conventions, high-level graphics conventions, and software engineering workflow processes) must be addressed. Here is a research paper related to this topic – Net Elements and Annotations: Computations and Interactions in PDF. [See link attached to this reply].
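The countdown timer example above can be restated as a short Python sketch. This is not the JavaScript/PDF implementation mentioned in the reply; the class names are illustrative, and the status annotations are computed on demand through `is_enabled` rather than stored.

```python
class Place:
    """A place whose mark annotation holds the timer value."""

    def __init__(self, marks):
        self.marks = marks


class Input:
    """An arc from a place to a transition."""

    def __init__(self, place):
        self.place = place

    def is_enabled(self):
        # Status annotation: true while the input place still holds marks.
        return self.place.marks > 0

    def fire(self):
        # Deduct one from the mark annotation of the input place.
        self.place.marks -= 1


class Transition:
    """A transition that delegates firing to its single input."""

    def __init__(self, input_arc):
        self.input_arc = input_arc

    def is_enabled(self):
        return self.input_arc.is_enabled()

    def fire(self):
        if not self.is_enabled():
            return False
        self.input_arc.fire()
        return True


timer = Place(10)
tick = Transition(Input(timer))
history = []
while tick.fire():  # each firing stands for one timer interval
    history.append(timer.marks)
```

An interactive version would call `tick.fire()` from a user event handler instead of a loop, and would redraw the place and transition after each firing.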
  • asked a question related to Distributed Algorithms
Question
1 answer
I am working on the distribution system side, comparing and analysing my results according to my objective function. I am having a problem with the 118-bus system: according to my load flow, the base-case power loss is 1291 kW, but in many papers it is 1296 kW. So I am in doubt as to which is correct.
Relevant answer
Answer
I take the bus data and line data of the 118-bus system, enter them into my load flow and run it; the base-case power loss I get is 1291 kW, but some papers report 1296 kW. Which is correct?
  • asked a question related to Distributed Algorithms
Question
2 answers
I want to test some parallel implementations of greedy algorithms that solve NP-complete problems such as set-cover or max-matching.
Relevant answer
Answer
You can try LEMON or igraph, even though not all of the algorithms are parallelized. However, you can also take a look at Bob++ (Google: Bertrand Lecun, Bob++).
  • asked a question related to Distributed Algorithms
Question
6 answers
A multi-writer shared storage is one that allows two writers to modify the same storage object concurrently without imposing any locking. Usually, in cases of write collision, one of the write operations is visible and the other is hidden, meaning that no read operation will later observe this write. I just wonder if such behavior is useful for any existing applications. For example, if the shared storage implemented a file system, would it be OK for one writer to overwrite the changes of a concurrent writer on a single file?
Relevant answer
Answer
@Peter: I think that the way this question is asked is not so much about consistency in the DB. Though consistency is certainly a problem with this approach.
I can't think of any situation where this kind of multi-writer approach is useful. Suppose the current state is A and writer1 wants to write B, writer2 wants to write C. In many cases writers think that the current state is what they have written. Sometimes this cannot be assumed, so each writer reads back the current state directly after writing. Suppose that writer1 wins the battle and writer2 will never write C (i.e. it is hidden, just as you explained). Now writer2 wants to read back the current state, but writer1 has not finished writing. Which state does writer2 get? A, B or C? (B only works if writer2 waits until writer1 is finished -> you need synchronization again.) Since C will be hidden, it can only be A or B. We can't read A because writer2 knows that it is older than C. And B is not available yet. What do you do now?
In another scenario, maybe when writing to a log file where there is no reader during execution of the application, I don't see why we would throw away any writes. We could use one thread that handles a queue and both writers can enqueue their write commands almost without any delay (lock-free data structures come to mind).
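The single-thread-plus-queue idea in the last paragraph can be sketched as follows. This uses Python's standard `queue.Queue`, which is thread-safe but internally lock-based, so it illustrates the structure rather than a lock-free implementation. The names and the `None` stop sentinel are assumptions made for the example.

```python
import queue
import threading


def log_writer(commands, sink):
    """Single consumer: applies write commands in the order they arrive."""
    while True:
        item = commands.get()
        if item is None:  # sentinel: no more writes
            break
        sink.append(item)


def writer(commands, name, count):
    """A producer that enqueues its writes instead of touching the log."""
    for i in range(count):
        commands.put(f"{name}:{i}")


commands = queue.Queue()
log = []
consumer = threading.Thread(target=log_writer, args=(commands, log))
consumer.start()

writers = [
    threading.Thread(target=writer, args=(commands, name, 100))
    for name in ("w1", "w2")
]
for t in writers:
    t.start()
for t in writers:
    t.join()

commands.put(None)  # all writers are done; stop the consumer
consumer.join()
```

No write is hidden: both writers' entries end up in the log, interleaved in arrival order, and each writer's own entries keep their relative order.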
  • asked a question related to Distributed Algorithms
Question
2 answers
Given an auto-scaling system, we face inputs that have unpredictable patterns and volumes. Because resources are allocated per input, fluctuations in input volume cause considerable resource overhead. Can you suggest an algorithm that can help the system's performance?
Relevant answer
Answer
Dear Kareem
Salam,
As in the question above, I have an auto-scaling system with a fluctuating input volume, and I want to allocate my resources in the best way.
The pattern of my input volume is unpredictable, however. Is there any algorithm that can help?
  • asked a question related to Distributed Algorithms
Question
6 answers
I am working on data streams that yield 256-bit strings. I want to cluster them in real time (i.e. I don't want to save them in memory, and I want to give each instance a cluster id in real time).
I have a similarity measure for these strings, defined as the ratio of the number of 1s in the AND of the two strings to the number of 1s in their OR (very similar to Jaccard similarity).
I initially worked with sequential leader clustering for this, and it gave great results. The only problem is that it cannot be applied to distributed systems.
I have found that minhashing and LSH can be implemented for real-time text clustering. As I understand it, minhashing is used to generate signatures, and the signatures are then clustered by banding. And if the hash functions are the same, the minhash signatures are the same on every node in a distributed setting.
Can there be an LSH approach for clustering the bit strings in my case with the defined similarity measure?
Or can I apply minhashing/LSH with some trick to get the desired results?
Relevant answer
Answer
MinHash is indeed a locality-sensitive hashing scheme when locality is defined according to the Jaccard similarity.
See (by increasing completeness):
and
for an example of implementation
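A minimal sketch of how MinHash with banding could be applied to the 256-bit strings from the question, treating each string as the set of positions of its 1 bits. The bit strings are represented as Python integers, and the shared seed, the number of hash functions and the band size are all assumptions chosen for illustration.

```python
import random


def jaccard(a, b):
    """Similarity from the question: ones in (a AND b) over ones in (a OR b)."""
    union = a | b
    return bin(a & b).count("1") / bin(union).count("1") if union else 1.0


def make_permutations(num_hashes, num_bits=256, seed=42):
    """Random permutations of bit positions; the same seed gives the same list."""
    rng = random.Random(seed)
    perms = []
    for _ in range(num_hashes):
        p = list(range(num_bits))
        rng.shuffle(p)
        perms.append(p)
    return perms


def minhash_signature(bits, perms):
    """For each permutation, the smallest permuted index among the set bits."""
    num_bits = len(perms[0])
    positions = [i for i in range(num_bits) if (bits >> i) & 1]
    if not positions:  # all-zero string: use a sentinel value
        return tuple(num_bits for _ in perms)
    return tuple(min(p[i] for i in positions) for p in perms)


def band_keys(signature, rows_per_band):
    """Split a signature into bands; strings sharing any key are candidates."""
    return [
        (start, signature[start:start + rows_per_band])
        for start in range(0, len(signature), rows_per_band)
    ]


a = 0b1111
b = 0b0111
exact = jaccard(a, b)

perms_node1 = make_permutations(32)  # built independently on "node 1"
perms_node2 = make_permutations(32)  # built on "node 2" with the same seed
sig_a_node1 = minhash_signature(a, perms_node1)
sig_a_node2 = minhash_signature(a, perms_node2)
sig_b = minhash_signature(b, perms_node1)

# The fraction of matching components estimates the Jaccard similarity.
estimate = sum(x == y for x, y in zip(sig_a_node1, sig_b)) / len(sig_b)
keys = band_keys(sig_a_node1, rows_per_band=4)
```

Each node can use the band keys as bucket identifiers and compare a new string only with the strings in its buckets, confirming candidates with the exact `jaccard` value. For identical signatures across machines, the permutations should be generated once and distributed, or every node should run the same Python version with the same seed.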
  • asked a question related to Distributed Algorithms
Question
4 answers
Both require the data to be partitioned. Can distributed algorithms be considered a subtype of parallel algorithms?
Relevant answer
Answer
I'll try to answer in layman's terms (no guarantee of being formally correct).
Distribution has to do with where the computation physically resides. A distributed algorithm is executed on multiple CPUs, connected by networks, buses or any other data communication channel. Parallelism has to do with the fact that in the algorithm two or more flows of control may execute (even if only virtually) at the same time.
True (not virtual) parallelism requires distribution. But a parallel algorithm may also execute on virtual machines realized on a single physical CPU.
  • asked a question related to Distributed Algorithms
Question
7 answers
.
Relevant answer
Answer
A platform as a service from the open source community and Red Hat: OpenShift.
This PaaS software is included in the Linux distribution Fedora 19.
  • asked a question related to Distributed Algorithms
Question
18 answers
When we connect different DGs to particular loads, which DGs should be selected for a particular load during non-peak hours? What kind of algorithm can be used for the selection of DGs in a smart grid?
Relevant answer
Answer
You need a correct mathematical model of the DG operation problem, including the objective function(s), the economic and technical constraints of the system, and acceptable modeling of uncertain parameters. Then you should either code a numerical optimization algorithm or use a commercial package (like GAMS) to solve the hour-based daily DG scheduling problem. Then analyze the results to see if they make sense. If they do not, update your model or mathematical program until you get meaningful results for the next 24 hours. Good luck!