# Distributed Systems

Is anyone aware of a queuing model of a computing grid system (or scheduler)?
Links to papers, or ppts for computer science courses would be appreciated.
Vasileios Kolonias · University of Patras
Hello, TORQUE is another one: http://www.adaptivecomputing.com/products/open-source/torque/ Cheers, Vasilis
Are there any algorithms for handling fluctuations in input volume?
Given an auto-scaling system, we face inputs that have unpredictable patterns and volumes. Because resources are allocated per input, fluctuations in input volume cause considerable resource overhead. Can you identify an algorithm that would improve the system's performance?
Christopher Landauer · The Aerospace Corporation
Another manifestation of the same problem might be the local management of a non-stationary random process. Suppose we have a sequence of sensor measurements limited to, say, 16-bit words; we then want to scale the range to maximize precision (the map can be linear or logarithmic, depending on the sensor). On this reading, the question is about algorithms that track the extreme values of such a time series so as to stay as close as possible to maximum precision. That is a matter of adjusting the data model (the definition of how the bits are to be interpreted) to fit the local empirical behavior of the time series. Is this a reasonable interpretation of the original question?
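The range-tracking idea in this answer can be sketched as follows. This is only an illustration under stated assumptions: a linear map, and a hypothetical `decay` factor that lets the tracked extremes relax toward recent samples. The class name, the parameter, and the update rule are not from the thread.

```python
class AdaptiveRangeQuantizer:
    """Tracks the recent min/max of a stream and maps samples to 16-bit codes."""

    LEVELS = 2 ** 16 - 1  # largest 16-bit code

    def __init__(self, decay=0.01):
        self.decay = decay  # assumed tuning knob: how fast the range shrinks back
        self.lo = None
        self.hi = None

    def update(self, x):
        """Widen the range immediately on a new extreme, shrink it slowly otherwise."""
        if self.lo is None:
            self.lo = self.hi = float(x)
            return
        # Both bounds drift toward the new sample...
        self.lo += self.decay * (x - self.lo)
        self.hi += self.decay * (x - self.hi)
        # ...but never past it, so the sample always stays inside the range.
        self.lo = min(self.lo, x)
        self.hi = max(self.hi, x)

    def encode(self, x):
        """Linear map from [lo, hi] to [0, LEVELS]; out-of-range samples are clipped."""
        span = self.hi - self.lo
        if span == 0:
            return 0
        clipped = min(max(x, self.lo), self.hi)
        return round((clipped - self.lo) / span * self.LEVELS)

    def decode(self, code):
        """Inverse of encode under the current range."""
        return self.lo + code / self.LEVELS * (self.hi - self.lo)
```

A logarithmic sensor would replace the linear map in `encode` and `decode`; the tracking in `update` stays the same. Note that a decoder needs to know the range that was in force when each code was produced, which this sketch does not transmit.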
What is the best categorization for scheduling algorithms?
How do we categorize scheduling algorithms, in all respects, for a distributed computing environment?
Matthias Werner · Technische Universität Chemnitz
There isn't a "best" categorization. Depending on your goal, different categorization schemes/taxonomies may be useful. Frequently used categories are, e.g., the process model a scheduling algorithm applies to (e.g. preemptable/non-preemptable), or the scheduling goal (e.g., fairness, performance, etc.).
Any ideas for grouping the memory of several GPU cards into one big shared memory?
A GPU card has a limited amount of memory; how can we increase it? I can think of two approaches (there may be more). The first is to group the memory of multiple GPU cards into one. The second is to stream data through the algorithms. Any ideas?
Juan Orduña · University of Valencia
Depending on your final purpose, an alternative way of grouping GPU card memory could be to use the GPUs in other nodes. If this is the case, then rCUDA could be an effective solution for you. See http://www.rcuda.net/
What are overlapping concepts in Grid computing, Cloud Computing, and Big-data Computing?
Grid computing, Cloud Computing, and Big-data Computing are used as distributed computing mechanisms. What are common and overlapping concepts? Why are research efforts that are spent on one not easily applicable to others? Which research issues can be handled in common? Are research efforts of various researchers used effectively?
Reeta Sony · National Law University, Delhi
Cloud computing is nothing but an advanced form of grid computing. Big data is not a form of computing; it relies on grid and cloud computing to process the data. Big data is a huge bulk of raw data that is very high in volume, variety, and velocity. It needs very high computing capacity to process, and only cloud computing can meet that need with its flexible, scalable, and cost-effective nature.
• Nisha Dhankher asked a question:
What does this Open MPI error mean: btl_tcp_endpoint.c:638 connection failed due to error 113?
This error occurred on node 10.1.254.236 while running mpiBLAST on a Rocks cluster.
Why does CAP (consistency, availability, and partition tolerance) not work for a single data center but work for wide area networks?
From the Gilbert/Lynch proof of CAP, I could not find any difference between local area and wide area networks when applying CAP. Someone said that transient network problems are common in wide area settings but can be eliminated within a single data center by proper failure detectors. However, as far as I know, failure detectors always need time to discover failures and have some probability of false suspicion; transient failures in local area networks can therefore only be reduced by failure detectors, never eliminated. Does anyone know about this?
Xu Wang · Beijing University of Aeronautics and Astronautics (Beihang University)
I think I understand what you mean. Within a single data center, we can build enough redundant paths between two nodes that we can assume there is always a live path. In wide area network settings, there are no such redundant paths. Thanks.
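The redundancy argument can be made concrete with a back-of-the-envelope calculation. The sketch below assumes that paths fail independently, which is a strong assumption the thread does not make (shared switches, power, or software faults break it), and the 1% outage figure is purely hypothetical.

```python
def prob_all_paths_down(p_down, n_paths):
    """Probability that every one of n independent paths is down at once."""
    if not 0.0 <= p_down <= 1.0:
        raise ValueError("p_down must be a probability")
    return p_down ** n_paths


# Hypothetical per-path outage probability of 1%.
for n in (1, 2, 4):
    print(n, prob_all_paths_down(0.01, n))
```

Under these assumptions the chance of a partition falls geometrically with the number of paths but never reaches zero, which is consistent with the asker's point that partitions inside a data center can be made rare rather than impossible.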
What PBLAS implementations do you use?
There are several implementations of BLAS - Intel MKL, ESSL, GotoBLAS and its more recent variants: OpenBLAS (that I use myself) and Survive GotoBLAS... But they are designed for single (yet multi-core!) computers. What about distributed environments? AFAIK, MKL contains an implementation of PBLAS, but what other than that? The reference Netlib implementation? What can you suggest?
Phil Mucci · University of Tennessee
Most of the time, doing BLAS in parallel over MPI is a loss, mostly because of the size of the matrices and the amount of data exchange that needs to happen. In this particular case, it would be helpful to know which BLAS functions you particularly need and what the dimensions of the matrices are. An even better question is: what is the higher-order algorithm you're trying to solve? Are you solving systems of linear equations? Some other factorization? You'd be much better served by libraries that implement that routine, so that they can schedule the low-level linear algebra operations appropriately and take advantage of as much locality as possible. The NETLIB PBLAS was invented a while back, and there are now much better ways to do it in parallel: task-graph-driven approaches as used in libflame, MAGMA, PLASMA and Intel's Cluster Math library. These approaches reduce or eliminate the latency by using a dataflow-style approach rather than a static schedule. Phil Mucci, Minimal Metrics
Is a graph database preferred for distributed data?
If the data is distributed across different machines, is using a graph database efficient or not?
Katarina Grolinger · The University of Western Ontario
Graph databases specialize in handling highly interconnected data and are therefore very efficient at traversing relationships between different entities. They are suitable in scenarios such as social networking applications, pattern recognition, dependency analysis, recommendation systems and solving path-finding problems raised in navigation systems (http://www.journalofcloudcomputing.com/content/2/1/22#B50). Partitioning a graph database over networked nodes is a challenging task. Graph partitioning attempts to achieve a trade-off between two conflicting requirements: related graph nodes must be located on the same server to achieve good traversal performance, but, at the same time, too many graph nodes should not be on the same server because this may result in heavy and concentrated load (http://www.journalofcloudcomputing.com/content/2/1/22#B50). An example of a graph partitioning approach is http://www.airccse.org/journal/ijcses/papers/0212ijcses09.pdf
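The two conflicting requirements described above can be measured on a toy example. This is a minimal sketch, not any particular database's partitioner: `partition_quality`, the four-node graph, and the two assignments are all illustrative assumptions.

```python
from collections import Counter


def partition_quality(edges, assignment):
    """Return (edge_cut, max_load) for a node -> server assignment.

    edge_cut counts edges whose endpoints sit on different servers
    (each one costs a network hop during traversal); max_load is the
    number of nodes on the busiest server.
    """
    edge_cut = sum(1 for u, v in edges if assignment[u] != assignment[v])
    load = Counter(assignment.values())
    return edge_cut, max(load.values())


# A 4-cycle: a - b - c - d - a
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]

all_on_one = {node: 0 for node in "abcd"}          # best traversal, worst load
split = {"a": 0, "b": 0, "c": 1, "d": 1}           # balanced, but cuts edges

print(partition_quality(edges, all_on_one))
print(partition_quality(edges, split))
```

Placing everything on one server gives no cut edges but concentrates the whole load there; splitting the nodes evenly halves the load at the price of cross-server edges. A partitioner has to pick a point between these extremes.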
What are the current trends in SOA for embedded systems?
In recent years service-orientation has become a serious architectural pattern for distributed embedded systems. Projects like DPWS have become quite popular. What do you think are the recent trends in this area and what may a research roadmap for SOA in the embedded world contain?
Frédéric Pinel · University of Luxembourg
Microservices may interest you ( http://martinfowler.com/articles/microservices.html ); the idea combines REST services with the Unix design philosophy. Apparently Netflix's A. Cockcroft was the first to mention it ( http://www.infoq.com/interviews/Adrian-Cockcroft-Netflix ). He also advocated the use of mobile components (ARM SoCs) in the data center.
What is a distributed system? What are the testing aspects of a distributed system?
Are there any limitations on testing distributed systems?
Dirman Hanafi · Universiti Tun Hussein Onn Malaysia
A distributed system is one whose functions are distributed across its constituent units. The testing aspects consist of synchronization, coordination, and communication.
Does anyone know about filesystem performance in Linux?
Especially for ZFS and other filesystems, including SmartOS?
Maciej Rostanski · Wyzsza Szkola Biznesu
Here's one article my students made with me. Maybe it's going to be helpful.
What techniques are used in the detection and identification of performance bottlenecks in n-tier applications from a third-party perspective?
Deploying enterprise applications using a tiered architecture is increasingly becoming standard in both virtualised and non-virtualised environments. A major challenge in managing such applications is quickly detecting performance bottlenecks, anomalies, or surprises and identifying which tier is causing the perceived constriction. The interest here is in knowing which techniques have been used, or are being used, for such tasks, especially in virtualised environments where applications are deployed in black boxes (VMs) and the infrastructure provider can detect these changes and identify the affected tier only by externally observing the performance of the applications. In this case, how can one use the external system-level performance "vital signs" of virtual machines to quickly detect anomalies and identify the affected tier?
Olumuyiwa Ibidunmoye · Umeå University
Yes, the network infrastructure itself could be the bottleneck, I agree. But we are looking at the VM level: we want to identify within a short time which VM is dragging performance down, and why. Do you have any ideas about identifying network-induced bottlenecks?
Concurrency vs Parallelism
What is the difference between concurrent transmission/processing and parallel transmission/processing?
Jonathan Jewell · The Open University (UK)
An analogy at its most basic: I am ironing clothes whilst the washing machine is on. I am processing the items of clothing in turn, i.e. I can't iron a shirt and a towel at the same time. One comes after the other. The washing machine is running in parallel with my ironing. Nothing I do about the ironing (faster or slower, more of it, less of it, more or fewer people working on it, or whatever) makes the washing machine cycle go faster or slower. The two processes are running in parallel. While I iron, I routinely watch TV: a small amount of my attention is on the TV and a small amount is going into the ironing. Suddenly something comes on TV which is really, really interesting or important. My concentration on the TV increases markedly and I may even stop ironing to give it my full attention. This is concurrency. The TV and the ironing are going on at the same time but they are calling on the same resource. Turn the radio on at the same time as the TV and the resource for concurrent processing is overwhelmed. We might need to concentrate on the radio and then switch our attention back to the TV; certainly we are not going to be continuing with the ironing (unless especially deft). Breaking up the management of these tasks is the same sort of thing that we see when we start talking about pre-emptive and co-operative multitasking, but there are other ways to do the best we can when we are sharing a finite resource amongst tasks. Meanwhile the whole process of ironing, looked at by itself, is coming along more or less slowly depending on the resource priority we attach to it. And the washing machine is happily doing its thing, not affected at all. It is effectively running on its own processor.
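The concurrent half of the analogy (one person, several tasks sharing attention) can be sketched with a cooperative round-robin scheduler. This is an illustration only: the generator-based tasks and the function names are assumptions chosen to keep the example deterministic, not how any particular operating system does it.

```python
from collections import deque


def task(name, steps):
    """A task that hands control back after each step (co-operative multitasking)."""
    for i in range(1, steps + 1):
        yield f"{name} step {i}"


def run_concurrently(tasks):
    """A single worker interleaves several tasks round-robin."""
    queue = deque(tasks)
    log = []
    while queue:
        current = queue.popleft()
        try:
            log.append(next(current))
            queue.append(current)  # not finished: back of the line
        except StopIteration:
            pass  # finished: drop it
    return log


log = run_concurrently([task("ironing", 3), task("tv", 2)])
for entry in log:
    print(entry)
```

Only one step runs at any instant, so adding tasks slows every task down: that is the shared resource in the analogy. The washing machine would be a second worker (a separate core or machine) whose progress does not depend on this loop at all, which is the parallel case.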
Share your thoughts on negative edges in a graph.
How can we realize a negative-weighted edge in real-world situations? What are the applications of negative-weighted edges?
Rogier Noldus · Ericsson
I suggest we go one step further, namely to edges with 'complex weight'. Edges with a complex weight (real value + imaginary value) may be used to represent an electrical circuit comprising resistors, capacitors and coils. This can easily be represented in the adjacency matrix. However, most of the commonly used graph metrics assume non-directional edges with unit weight. In particular, graph analysis tools such as graph spectra, effective graph resistance, assortativity etc. will become more complex. This is an area of ongoing (and interesting) research.
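A complex-weighted adjacency matrix for a small circuit might look like the sketch below. It uses the standard impedances of ideal components (R for a resistor, jωL for a coil, 1/(jωC) for a capacitor); the three-node circuit, the component values, and the 50 Hz frequency are assumptions made up for illustration.

```python
import math


def impedance(kind, value, omega):
    """Complex impedance of an ideal component at angular frequency omega."""
    if kind == "R":
        return complex(value, 0.0)                   # resistor: purely real
    if kind == "L":
        return complex(0.0, omega * value)           # coil: j * omega * L
    if kind == "C":
        return complex(0.0, -1.0 / (omega * value))  # capacitor: 1 / (j * omega * C)
    raise ValueError(f"unknown component kind: {kind}")


def adjacency(n, components, omega):
    """Symmetric n x n matrix of edge impedances; 0 is used here to mean 'no edge'."""
    matrix = [[0j] * n for _ in range(n)]
    for u, v, kind, value in components:
        z = impedance(kind, value, omega)
        matrix[u][v] = matrix[v][u] = z
    return matrix


omega = 2 * math.pi * 50  # assumed 50 Hz supply
# (node, node, kind, value): 100 ohm resistor, 0.1 H coil, 1 microfarad capacitor
circuit = [(0, 1, "R", 100.0), (1, 2, "L", 0.1), (2, 0, "C", 1e-6)]
A = adjacency(3, circuit, omega)
```

Using 0 for "no edge" is a simplification, since a zero impedance would physically be a short circuit; an admittance matrix (the reciprocal weights) avoids that ambiguity.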
Is game theory a good tool to model interactions among tenants in a data center?
Sometimes tenants have competing business interests and could try to starve each other's tasks by strategically requesting resources.
Carlos Barreto · University of Texas at Dallas
The main problem might be to model the profits/losses of each individual as a function of 1) the resources available to complete its own tasks and 2) the success/failure of an opponent in completing its tasks. This is an interesting and realistic scenario. For example, if two players are in a contest to solve the same problem, the game can be seen as a zero-sum game, in which the player that solves the problem first is the winner. However, there might be rules in the resource allocation process that make the problem more complex. In any case, this particular contest scenario might not be too realistic.
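The zero-sum framing can be sketched with a tiny payoff matrix. Everything here is hypothetical: the two strategies ("modest" and "greedy" resource requests), the payoff numbers, and the restriction to pure strategies are assumptions for illustration, not a model proposed in the thread.

```python
def maximin_row(payoff):
    """Row player's best pure strategy against a worst-case opponent."""
    worst = [min(row) for row in payoff]
    best = max(range(len(payoff)), key=lambda i: worst[i])
    return best, worst[best]


def minimax_col(payoff):
    """Column player's pure strategy that minimises the row player's best payoff."""
    n_cols = len(payoff[0])
    worst = [max(row[j] for row in payoff) for j in range(n_cols)]
    best = min(range(n_cols), key=lambda j: worst[j])
    return best, worst[best]


# Hypothetical payoffs to tenant A; tenant B receives the negative (zero-sum).
# Strategies (rows for A, columns for B): 0 = modest request, 1 = greedy request.
payoff = [[0, -2],
          [2, 0]]

print(maximin_row(payoff))
print(minimax_col(payoff))
```

With these made-up numbers both tenants end up requesting greedily and neither gains, which is the kind of mutual starvation the question worries about. Real allocation rules (quotas, pricing, priorities) would change the payoffs and usually make the game non-zero-sum.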
Are there any simulators for cloud load balancing?
Cloud computing.
Srinivas Jagirdar · Muffakham Jah College of Engineering and Technology
Thanks all
In your opinion, what is the best way to handle malicious nodes in Peer-to-Peer networks?
Several techniques can be applied in Peer-to-Peer networks to handle the presence of malicious nodes (reputation systems, accountability, distributed consensus, etc.). Which one do you think offers the best trade-off between the ability to discriminate malicious nodes from honest ones and the cost of the technique (in terms of the number of messages, for example)? This is, of course, given that none of these systems can decide with 100% accuracy whether a node is malicious or not.
Natalia Miloslavskaya · National Research Nuclear University MEPHI
How can I simulate a cloud setup in which I can get data provenance and process it towards data integrity detection and tracking of cloud data?
I am new to research; my MS thesis topic is "Detection of data integrity violations of Cloud data using Data Provenance". I have developed an algorithm that uses already recorded provenance to decide whether a data integrity violation has occurred. At this stage I have to implement it to get results. My first question is: can I do this through simulation, or do I have to use a real cloud setup? If simulation is possible, which simulator would be helpful? My second question is: is there any provenance recorder for cloud data from which I can get the provenance information I need for decision making? Please give me guidance and share material in this regard, as I am new to research as well as to cloud computing. Any related suggestions are welcome.
Hamilton Turner · Virginia Polytechnic Institute and State University
Can anyone tell me which network simulator is good for implementing network coding for P2P networks: OPNET, OMNeT++ or NS2?
Gábor Lencse · Széchenyi István University
There is a good survey of P2P simulators: http://www.ietf.org/proceedings/65/slides/P2PRG-1.pdf On slide 15, they compare the capabilities of the simulators in terms of the total number of nodes that may be simulated (on standard PCs, I think) and whether they can do distributed simulation. No "number of nodes" value is given for NS2, but it must be on the order of a few tens, surely below 100, which is the worst among all those examined. This is because it executes e.g. Cisco images; it can therefore be very accurate, but it requires a lot of computing power. (I would say it is an emulator rather than a simulator.) I have used OMNeT++ for more than 15 years (though never for P2P simulation) and have found it a convenient and flexible open-source simulator. It also has a large user community (and a mailing list), so you can easily get help if you need it. It has really a LOT of models; I just searched the web for "omnet++ p2p file sharing" and found e.g.: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.151.4015 OMNeT++ is free for academic purposes, but you will need a licence for commercial use.
Which is the preferable simulation tool for a service overlay network?
How do I build a service overlay network?
Afaq Ahmad · Sultan Qaboos University
Dear Umadevi K S, the article 'An extensible simulation tool for overlay networks and services' is a good starting point. You can access it via the link below. For the full paper, you can ask the author for a copy rather than paying (if there is a charge) for the download. http://dl.acm.org/citation.cfm?id=1529740
Does anyone have experience with VM isolation in cloud computing and can advise on how to implement it in the iCanCloud simulator?
How do I use the iCanCloud simulator?
shawgi mahmoud abdelbagi mohamed · Huazhong University of Science and Technology
Hi, it looks like you have the iCanCloud simulator. I want to use iCanCloud, so I downloaded it, and I faced many problems before reaching the stage of running it. When I run it as a local C application, a window asks me which application I want to run. When I type icancloud and press OK, it starts running but gives me the following error: OMNeT++ Discrete Event Simulation (C) 1992-2010 Andras Varga, OpenSim Ltd. Version: 4.1, build: 100611-4b63c38, edition: Academic Public License -- NOT FOR COMMERCIAL USE See the license for distribution terms and warranty disclaimer <!> Error during startup: Cannot open ini file `omnetpp.ini'. Please help me.
Are there any MapReduce algorithms for big data?
Cloud and big data
Sasikumar Mukundan · Centre for Development of Advanced Computing
Phani: you may want to elaborate what you are looking for. Is it a general research interest? Do you have some specific tasks or computation that you want to perform? What aspect of big-data are you concerned about: volume, velocity, or variety?
What are the trends of Social Networks in Computer Science?
E-HealthCare based on Semantic Web Technologies
Guido Governatori · National ICT Australia Ltd
Two related hot topics in this area are access control and privacy
What open-source frameworks implement the publish–subscribe approach over IP/UDP and are intended for small embedded systems?
There is a commercially available implementation of the RTPS protocol, RTI Connext Micro, which scales well to small microcontroller-based embedded systems. There is also an open-source project, ORTE (http://orte.sourceforge.net), that implements the RTPS protocol, but I haven't found a port of it for microcontroller-based systems.
David Swords · University College Dublin
Take a look into the PEIS Middleware, I think they recently expanded it to include support for embedded devices. ftp://aass.oru.se/pub/jrd/publications/thesis_summary_jay_IT10.pdf
Why am I getting the following error when finding principal components in MATLAB: "Index exceeds matrix dimensions."?
Code is as follows:

    input_dir = 'E:\13-09-13Downloads\CroppedYale\CroppedYale\trainingset\';
    image_dims = [48, 64];
    filenames = dir(fullfile(input_dir, '*.pgm'));
    num_images = numel(filenames);
    images = zeros(prod(image_dims),num_images);
    for n = 1:num_images
        filename = fullfile(input_dir, filenames(n).name);
        img = imread(filename);
        % img = rgb2gray(img); % Conversion of colored images to greyscale
        img = im2double(img); % Thanks to this line the best match is shown. Without it it was just a white space.
        img = imresize(img,image_dims);
        images(:, n) = img(:);
    end
    %% Training
    % steps 1 and 2: find the mean image and the mean-shifted input images
    mean_face = mean(images, 2);
    shifted_images = images - repmat(mean_face, 1, num_images);
    % steps 3 and 4: calculate the ordered eigenvectors and eigenvalues
    [evectors, score, evalues] = princomp('images' , 'econ');
    % step 5: only retain the top 'num_eigenfaces' eigenvectors (i.e. the principal components)
    num_eigenfaces = 11;
    evectors = evectors(:, 1:num_eigenfaces);

I am using Windows x32 OS. Kindly suggest....
Namrata Karkera · VIT University
Thank you for replying. I have traced the error in the code above. In the call to princomp (the function used to compute principal components), the parameter images had been placed in single quotes, so the program treated it as a string instead of a matrix and produced garbage values...
What are the current issues with the operating system?
Seeking special issues in distributed file systems. Can anybody help me or suggest any relevant articles?
Sri Krishnan · Centre for Development of Advanced Computing
Hi Abdul Haleem SL, some special issues in distributed file systems (in my view) are as follows:
1. Latency and synchronization of data
2. Variety of algorithms to distribute and pile the data
3. Handling data across a variety of time zones
4. Security
5. Authentication and verification across nodes
6. High availability across nodes
7. Way to store the data (structuring, compression and decompression), etc.
- Srikrishnan.V
Are there any tools to build a cloud locally (open-source and easy to deploy cloud services) for a research project?
Sri Krishnan · Centre for Development of Advanced Computing
Hi Sofiane Bendoukha, I would like to share my view on this. If you are a beginner, go ahead with Eucalyptus. Eucalyptus (open-source edition) has wide community support. It offers Fast Start, which removes the burden of installation and requires only configuration. If you are at an intermediate level, you could try CloudStack. If you are an expert, you could try OpenStack, OpenNebula, etc. Each cloud platform offers unique features. - Srikrishnan.V