Science topic

Knowledge Discovery - Science topic

Explore the latest questions and answers in Knowledge Discovery, and find Knowledge Discovery experts.
Questions related to Knowledge Discovery
  • asked a question related to Knowledge Discovery
Question
17 answers
From politicians' announcements, it appears that manned expeditions to the Moon will resume within a few years. Bases will be established on the Moon to prepare a manned expedition to Mars, which may be carried out in a dozen or so years. Both the US and China have already announced manned expeditions to Mars within that time frame.
So there is a rivalry over who will make the first manned trip to Mars. Will it be a competition similar to the one between the US and the USSR in the 1970s, with their competing programs for the planned and ultimately realized manned expeditions to the Moon?
Or maybe a manned expedition to Mars will not be a competition between the biggest economic and technological powers of the world? Perhaps these planned manned space expeditions will be international, implemented as part of international cooperation, with a crew composed of representatives of various countries?
What is your opinion on this matter? Should a manned space expedition to Mars be conducted as part of international competition between rival countries, or rather as part of international cooperation?
Do you agree with me on the above matter?
In the context of the above issues, I am asking you the following question:
Is the planned manned space expedition to Mars a result of rivalry between economic powers, or of their cooperation?
Please reply.
I invite you to the discussion.
Thank you very much
Best wishes
Relevant answer
Answer
Dear Stephen,
Thanks for the answer. This kind of technological competition may contribute to the emergence of many new technological innovations and to civilization's transition to the next stage of the current fourth technological revolution.
Thank you very much,
Best regards,
Dariusz Prokopowicz
  • asked a question related to Knowledge Discovery
Question
162 answers
Do you think that man will ever leave our solar system?
Please answer and comment.
I invite you to the discussion.
Best wishes
Relevant answer
Answer
Well, it seems like a utopia to me, but why not? Primitive man could not imagine that airplanes would one day be created and fly, yet it was achieved. So perhaps, within centuries, this purpose can be achieved as well.
  • asked a question related to Knowledge Discovery
Question
6 answers
A PhD is the highest recognized degree, currently considered a requirement for several kinds of jobs, such as academic, scientific, and managerial positions that are highly paid and respected in the scientific community.
Therefore, there is an obvious race to pursue a PhD at the best eligible university, to land in a well-established lab, to publish as many papers as possible, and finally to obtain the degree in the shortest possible time.
Do you think this is wrong? There is nothing wrong in competing for the above objectives, but what concerns me is the blind race, which eventually limits the quest for learning, enjoying the pursuit of knowledge discovery, being creative and confident, and arriving at self-discovery.
This tricky race has led to increased stress, anxiety, and depression among research students. Most of the time, I have observed that even after the PhD, people end up breeding ego rather than humility, which lands them in mediocrity. However, this mediocrity keeps them moving up the ladder. Actually, each PhD journey is unique and should not be compared to other PhDs'.
What is your overview of the current state of PhDs?
Relevant answer
Answer
Stephan J. Hauser Your reply gives a perfect picture of what usually goes on in a researcher's mind under the various uncertainties and pressures of getting published.
  • asked a question related to Knowledge Discovery
Question
27 answers
What technologies, currently considered surrealistic and possibly achievable in the future, could revolutionize technological progress and could solve key global problems of humanity?
For example, imagine a nuclear power plant with fusion technology in which the nuclear material would also neutralize toxic waste.
Or imagine power plants in which hydrogen is burned in oxygen in a controlled way to produce electricity, the whole plant built as a new-generation steam engine driven by the water vapor arising from that combustion of hydrogen in oxygen.
If this were possible, technological solutions of this type, currently considered surreal, could solve the global problems that derive from the increase in greenhouse gas emissions, the ever-accelerating process of global warming, and the growing demand for electricity.
What surrealistic concepts do you propose for the technologies of the future?
What is your opinion on this topic?
Do you agree with me on the above matter?
What other ideas do you know of for surrealistic technologies, known for example from science fiction novels and films, which, if they became real, could solve key global problems of humanity?
Please reply
I invite you to the discussion
  • asked a question related to Knowledge Discovery
Question
33 answers
Our previous research suggested that experts’ perceptions related to the existence of expertise in standardization in small and medium enterprises (SMEs) significantly differ.
While some experts argued that expertise resides in all SMEs, others strongly suggested that expertise does not exist in SMEs. Assuming that owners and managers of SMEs must have had some level of expertise that enabled them to start their business activities and successfully carry them forward, given their lack of resources and capabilities, the question still remains - does expertise in standardization exist in SMEs?
What do you think - does expertise in standardization actually exist in SMEs?
Relevant answer
Answer
SMEs tend to have more pressing priorities than standardization (which is certainly a good and desirable thing). So even when decent expertise in standardization exists, it may not be a current priority, not to mention cases where such expertise is absent within the company. Typically, standardization depends heavily on budget availability and the urgency of the moment. Eventually, some companies come to that point, unless they disappear before reaching it.
  • asked a question related to Knowledge Discovery
Question
4 answers
Many approaches propose combined alternatives in this sense, for instance using logic programming (Prolog), rough sets, and other formalisms joined with the models. However, it would be interesting to learn more practical strategies for extracting the inference rules produced by tree-based models.
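A hedged sketch of one practical route: scikit-learn's `export_text` walks a fitted decision tree and emits its decision rules as readable if-then conditions (the dataset here is only illustrative):

```python
# Sketch: extracting human-readable if-then rules from a fitted decision tree.
# Assumes scikit-learn is available; the iris dataset is illustrative.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(data.data, data.target)

# export_text walks the tree and prints one nested condition per node,
# which already reads as a rule set: each root-to-leaf path is one rule.
rules = export_text(tree, feature_names=list(data.feature_names))
print(rules)
```

The same traversal idea applies to each tree of a random forest, though the combined rule set grows quickly with ensemble size.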
Relevant answer
Answer
  • asked a question related to Knowledge Discovery
Question
10 answers
What are issues in data mining?
Data mining is not an easy task: the algorithms used can get very complex, and data is not always available in one place; it needs to be integrated from various heterogeneous data sources. These factors also create issues. Here we will discuss the major issues regarding −
  • Mining Methodology and User Interaction
  • Performance Issues
  • Diverse Data Types Issues
Mining methodology and user interaction issues refer to the following −
  • Mining different kinds of knowledge in databases − Different users may be interested in different kinds of knowledge. Therefore, data mining needs to cover a broad range of knowledge discovery tasks.
  • Interactive mining of knowledge at multiple levels of abstraction − The data mining process needs to be interactive because this allows users to focus the search for patterns, providing and refining data mining requests based on the returned results.
  • Incorporation of background knowledge − Background knowledge can be used to guide the discovery process and to express the discovered patterns, not only in concise terms but at multiple levels of abstraction.
  • Data mining query languages and ad hoc data mining − A data mining query language that allows the user to describe ad hoc mining tasks should be integrated with a data warehouse query language and optimized for efficient and flexible data mining.
  • Presentation and visualization of data mining results − Once patterns are discovered, they need to be expressed in high-level languages and visual representations that are easily understandable.
  • Handling noisy or incomplete data − Data cleaning methods are required to handle noise and incomplete objects while mining data regularities. Without them, the accuracy of the discovered patterns will be poor.
  • Pattern evaluation − Discovered patterns may be uninteresting because they represent common knowledge or lack novelty, so measures of pattern interestingness are needed.
Performance-related issues include the following −
  • Efficiency and scalability of data mining algorithms − To effectively extract information from the huge amounts of data in databases, data mining algorithms must be efficient and scalable.
  • Parallel, distributed, and incremental mining algorithms − Factors such as the huge size of databases, the wide distribution of data, and the complexity of data mining methods motivate the development of parallel and distributed data mining algorithms. These algorithms divide the data into partitions, which are processed in parallel; the results from the partitions are then merged. Incremental algorithms update existing mining results when the database changes, without mining the data again from scratch.
Diverse Data Types Issues
  • Handling of relational and complex types of data − A database may contain complex data objects, multimedia objects, spatial data, temporal data, etc. It is not possible for one system to mine all these kinds of data.
  • Mining information from heterogeneous databases and global information systems − Data is available from different sources on a LAN or WAN. These sources may be structured, semi-structured, or unstructured, so mining knowledge from them adds further challenges.
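The partition-and-merge idea behind the parallel and distributed algorithms above can be sketched minimally in Python (the transactions and the choice of 2-itemsets are illustrative):

```python
# Sketch: partition-and-merge counting, the basic idea behind parallel and
# distributed frequent-itemset mining. Data and itemset size are illustrative.
from collections import Counter
from itertools import combinations

transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"milk", "butter", "bread"},
    {"milk", "tea"},
]

def count_pairs(partition):
    """Count 2-itemsets locally within one data partition."""
    counts = Counter()
    for t in partition:
        for pair in combinations(sorted(t), 2):
            counts[pair] += 1
    return counts

# Each partition could live on a separate node; here we just split the list.
partitions = [transactions[:2], transactions[2:]]
merged = Counter()
for part in partitions:
    merged.update(count_pairs(part))  # merging local counts gives global counts

print(merged[("bread", "milk")])  # bread and milk co-occur in 2 transactions
```

Because counting is associative, the merge step is a simple sum; this is exactly why such algorithms parallelize well.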
Relevant answer
Answer
For Data Structures & Database Management Systems.
  • asked a question related to Knowledge Discovery
Question
1 answer
As we know, most researchers use manual validation by domain experts for unlabeled user reviews in a specific domain, but is there a newer way? I am working with a large dataset, and using experts would be difficult.
If anyone uses a new performance measure or a new way of validating, please let me know.
Thanks in advance.
Relevant answer
Thanks Gopi Battineni, I will check the paper.
  • asked a question related to Knowledge Discovery
Question
10 answers
As we know, most researchers use manual validation by experts in a domain, but is there a newer way, or any benchmark? If anyone has a benchmark dataset for this task in any domain, please share it with me if possible. Thanks in advance.
Relevant answer
Answer
Very interesting regarding the knowledge of ITES - Following .
  • asked a question related to Knowledge Discovery
Question
4 answers
I'm doing my dissertation on designing a model for Big Data knowledge discovery. I have used a questionnaire to gather the model components; now I want to know how I can validate the extracted model.
Relevant answer
Mohammed Elmogy thank you
  • asked a question related to Knowledge Discovery
Question
8 answers
Should the instruments of motivating scientists and employees of companies to come up with new ideas, research concepts and innovations be improved in management processes?
Yes. In my opinion, every scientific concept, idea, innovative solution, or technological improvement should be properly rewarded after its implementation in production; every originator or inventor should receive a salary. This is necessary in a modern, growing economy. In addition, the instruments for motivating originators, scientists, and people employed in companies in various positions should be constantly perfected, creating well-designed financial and/or non-financial rewards for ideas, innovations, patents, and technological improvements implemented with positive effects. It is necessary that as many new ideas and innovations as possible are created and implemented at production scale. In the 21st century, pro-ecological innovations are particularly sought after and necessary. In this way the scope of cooperation between science and business increases, which is essential in modern knowledge-based economies.
In the light of the above, and to encourage discussion, I ask you the following question:
Should the instruments for motivating employees to come up with new ideas, research concepts, and innovations be improved in management processes?
Do you know the latest concepts for improving the instruments that motivate scientists and company employees to come up with new ideas, research concepts, innovations, etc.?
Please answer and comment. I invite you to the discussion.
I wish you the best in the New Year 2019. Best wishes
Relevant answer
Well, if you are seeking the involvement of different organizational levels and collaborators, you need to design flexible motivation tools; a non-exhaustive list includes:
- public recognition: there are no "good ideas" or "bad ideas", only ideas, and the "winning concepts" are the sum of multiple, varied, and unusual ideas. It is good practice to make the set of small contributions to the innovation process visible.
- time to innovate: people need time to innovate, time to "fail fast", time to experiment.
- small awards: as fuel for the process, you can incentivize creativity with small rewards such as cinema tickets, a bottle of wine, or an invitation to a good restaurant. These small incentives do not replace monetary incentives, but they can help keep the flame alight.
- monetary incentives: in this case, I consider group prizes a better means of motivating teamwork than rewarding individual talent.
  • asked a question related to Knowledge Discovery
Question
8 answers
What are the procedures that we can implement in the Transformation step?
Relevant answer
Answer
Dear Deeman,
Balan has stated it correctly. The steps in the figure that you showed are indeed very useful.
  • asked a question related to Knowledge Discovery
Question
5 answers
The difference between these two designations lies in the type of approach used: Knowledge Discovery in Databases (KDD) draws on artificial intelligence, using heuristics from symbolic learning, while Data Mining (DM) draws on statistics and can be considered an industrialization of data analysis techniques.
Relevant answer
Answer
Lotfi Nabli-
The link in the first answer (from Dibakar Pal) is only missing the final 'l' in the URL. If you add it in your address bar, you should be able to see the site.
  • asked a question related to Knowledge Discovery
Question
1 answer
There are different kinds of knowledge discovery that can be implemented on smart meter measurements. Are you interested in this type of analytics, or in visualization?
Relevant answer
Answer
Following
  • asked a question related to Knowledge Discovery
Question
3 answers
Medical data mining has great potential for exploring the hidden patterns in the data sets of the medical domain. These patterns can be utilized for clinical diagnosis. However, the available raw medical data are widely distributed, heterogeneous in nature, and voluminous. These data need to be collected in an organized form. This collected data can be then integrated to form a hospital information system. Data mining technology provides a user-oriented approach to novel and hidden patterns in the data. Data mining and statistics both strive towards discovering patterns and structures in data. Statistics deals with heterogeneous numbers only, whereas data mining deals with heterogeneous fields. We identify a few areas of healthcare where these techniques can be applied to healthcare databases for knowledge discovery. In this paper we briefly examine the impact of data mining techniques, including artificial neural networks, on medical diagnostics.
Relevant answer
Answer
Dear Maysam Toghraee,
Though the question sounds different, it is a very important query. As far as I can understand from reading your question, ANN can be considered an important tool in the data mining domain; thus, it has a tremendous impact on mining data.
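To make the ANN-in-medical-data-mining point concrete, here is a minimal, hedged sketch using scikit-learn's MLPClassifier on a public diagnostic dataset; the dataset and hyperparameters are illustrative, not a clinical recommendation:

```python
# Sketch: a small artificial neural network applied to a medical diagnosis
# task (breast cancer dataset). Hyperparameters are illustrative only.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Feature scaling matters for neural networks; one small hidden layer suffices here.
model = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0))
model.fit(X_train, y_train)

print(round(model.score(X_test, y_test), 2))  # accuracy on held-out patients
```

In practice, medical use would further require calibration, external validation, and interpretability analysis, which this sketch omits.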
  • asked a question related to Knowledge Discovery
Question
3 answers
Is there any article that discusses the correlation and linkage between them?
Relevant answer
Answer
KDD is a type of exploratory data analysis, a subtype of data analysis.
Historically, the most common form of data analysis has been confirmatory data analysis (CDA), where you start with a hypothesis, design a study to falsify it, collect data in that study, then analyse it to determine whether or not it supports the hypothesis. The drawback of this approach is that identifying useful hypotheses to test can be difficult, often requiring high levels of technical insight combined with creativity.
KDD:
1) starts with no assumptions, just data;
2) does not lead to conclusions;
3) generates hypotheses. Due to the open-ended nature of KDD, it can generate any number of hypotheses, so under multiple hypothesis testing the required alpha becomes more stringent as more options are explored;
4) a significant result in KDD cannot be used for decision making without some form of robust validation.
KDD is open to many of the same risks as CDA, but in a greatly magnified way, as the data is often less rigorously collected and curated than in a focused dataset built for CDA. Data biases, high noise, missing data, format standardisation problems, and many other imperfections can give false positives. This means the alpha level for acceptance should be set to be tough on potential false positives.
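The multiple-testing point in (3) can be made concrete with a quick calculation; this is a simple Bonferroni sketch, and the numbers are illustrative:

```python
# Sketch: why KDD's many generated hypotheses demand a stricter alpha.
# A simple Bonferroni correction; the numbers are illustrative.
alpha = 0.05          # nominal per-test significance level
n_hypotheses = 100    # hypotheses generated by an exploratory KDD run

# Probability of at least one false positive if all 100 nulls are true:
family_wise_error = 1 - (1 - alpha) ** n_hypotheses
print(round(family_wise_error, 3))  # ~0.994: almost certain to find something "significant"

# Bonferroni: test each hypothesis at alpha / n to keep the family-wise rate near alpha.
corrected_alpha = alpha / n_hypotheses
print(corrected_alpha)
```

Less conservative procedures (e.g. false discovery rate control) exist, but the direction is the same: the more hypotheses KDD generates, the tougher the acceptance threshold must be.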
  • asked a question related to Knowledge Discovery
Question
3 answers
Hi,
I'm looking for a free tool to recognize terminology concepts in technical domains such as computer science and engineering.
Is there any available dictionary, gold standard, or tool for doing that? And why is there not much research in this direction?
Thank you,
  • asked a question related to Knowledge Discovery
Question
4 answers
If so, will this research challenge be tightly coupled with knowledge discovery? It is well known that a picture is worth a thousand words; therefore, there is a need for automated interpretation methods to aid end users who are, as it were, visualization-illiterate.
A list of researchers working in this area would also be welcome. Thanks.
Relevant answer
Answer
Yes, it can. More importantly, misinterpretation and lack of readability of VA/visualization can lead to serious consequences. I'd recommend you take a look at Edward Tufte's texts on how the Columbia accident was fostered by similar problems (try reading the second chapter of his book Visual Explanations). There is also a book on The Cognitive Style of PowerPoint - you can find a chapter here: https://www.google.com.br/url?sa=t&source=web&rct=j&url=https://www.inf.ed.ac.uk/teaching/courses/pi/2016_2017/phil/tufte-powerpoint.pdf&ved=0ahUKEwidusa9uZDXAhWEFpAKHcqtC_EQFgiAATAF&usg=AOvVaw21e8wg8IIEs8Mq_7tYJ7Aj
  • asked a question related to Knowledge Discovery
Question
4 answers
In machine learning, the trained model is sometimes so complicated that it may be hard to run due to resource limitations (e.g., on mobile devices). Some research proposes using other models to approximate the complicated ones. However, my question is: how can we make sure the approximation makes sense?
Reference: 
Buciluǎ, C., Caruana, R., & Niculescu-Mizil, A. (2006, August). Model compression. In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 535-541). ACM.
Relevant answer
Answer
Maybe one way to think about it in theoretical terms is as a projection operation. The original model's functional form constrains what it can learn and - assuming you have fit its parameters to a reasonable result - what you are trying to do is project the learned representation onto a different, simpler, and thus less expressive class of functions.
This is sure to result in loss of accuracy, but the question is whether that loss is meaningful in the cases you care about, i.e., on examples drawn from the distribution you will see in production. Hinton gave a really interesting presentation on what he dubbed "dark knowledge" that might be of interest to you:
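A minimal sketch of the distillation idea behind that "dark knowledge" talk: a large model's softened output distribution carries more information than its hard label, and a smaller model can be trained to match it. The logits below are invented for illustration; this is not the paper's exact method:

```python
# Sketch: temperature-softened teacher outputs ("dark knowledge").
# Logits are illustrative, not from a real trained model.
import numpy as np

def softmax(logits, temperature=1.0):
    z = np.asarray(logits, dtype=float) / temperature
    z -= z.max()                      # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

teacher_logits = [8.0, 2.0, 0.5]      # big model's raw scores for 3 classes

hard = softmax(teacher_logits)        # nearly one-hot: little to learn from
soft = softmax(teacher_logits, temperature=4.0)  # reveals class similarities

print(np.argmax(hard) == np.argmax(soft))  # True: the ranking is preserved
# The student is trained to match `soft` (often combined with true labels),
# so it inherits the teacher's knowledge of which wrong classes are "close".
```

Whether the approximation "makes sense" is then measured the usual way: evaluate the student on held-out data from the production distribution and check the accuracy gap against the teacher.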
  • asked a question related to Knowledge Discovery
Question
2 answers
For example, the range of knowledge generated while doing your Phd thesis.
Relevant answer
Answer
I think it may be similar to the way in which we understand critical thinking.  My rubric is attached.
  • asked a question related to Knowledge Discovery
Question
3 answers
Information credibility depends on several information quality attributes, mainly accuracy, relevance, coherence, and authenticity. Credibility is classified as presumed, reputed, surface, or experienced. Credibility is gained and lost based on the quality of the information being delivered, as well as on the perception and interpretation of that information. With the current explosive rate of data generated by business processes, and the magnitude of information published on the web, internet information sources are treated as resources, yet these resources lack quality verification mechanisms. Achieving higher levels of information quality assurance requires formalization of information architecture and organizational processes. Do you agree? If so, how should we resolve this problem?
Relevant answer
Answer
Constantin and Michael,
Many thanks for taking part in this dialogue.
I believe that, in order to enhance the quality of information, specifically the relevancy and accuracy dimensions, information ought to be classified. On the accuracy dimension, little can be done at the current architecture level; however, on the relevancy dimension, many enhancements can be deployed.
Credibility can be achieved by classifying information based on specified criteria that ought to be defined in a framework. This is where I see a good role for researchers: defining those criteria in a cohesive framework.
  • asked a question related to Knowledge Discovery
Question
2 answers
What are the possible ways of knowledge discovery from gene-disease associations extracted from biomedical text?
Relevant answer
Answer
  • asked a question related to Knowledge Discovery
Question
4 answers
The user contributions in the form of data and information are processed by using crowd-sourced human computations to generate knowledge in a knowledge management system.
Need pointers to similar research and any formal approaches to describe the process of knowledge creation in community crowds.
Relevant answer
Answer
@Franz Plochberger: The user contributions are artifacts containing explicit knowledge produced by humans using their cognition. Machine computation is used to facilitate the distribution of tasks and to capture, share, etc. the human responses. Kindly see the attached link.
  • asked a question related to Knowledge Discovery
Question
7 answers
Dear RG members,
Does someone know of guidelines for identifying knowledge gaps in health research?
In the literature there are some systematic reviews about the knowledge gaps in certain areas (e.g. cardiovascular diseases), but there are no clear guidelines about how to identify those knowledge gaps.
Thank you.
Relevant answer
Answer
Dear Andreas
This is a really interesting question that raises many issues! A few thoughts strike me.
i) Where there is existing evidence knowledge gaps can be defined by uncertainties directly related to this evidence in relation to the questions it asks- e.g. Evidence about a treatment for X is insufficient to prove benefit
ii) Knowledge gaps are defined by an assessment of what knowledge is needed. 
Within systematic reviews, this implicitly happens when review questions are identified. Gaps might be identified by the absence of this evidence. In this respect knowledge gaps are defined by an interaction between experts and the evidence because experts define the knowledge that is required. However, expertise is loosely defined and there is no 'method' to identify the questions for which answers are required. Effectively this process happens implicitly with no 'method' other than peer reviewers agreeing that the questions for the review were sensible in the first place.
However, the emerging field of research prioritisation takes this one step further and involves the users of research in identifying the answers that are needed, with an assessment of what evidence is already available informing the ultimate prioritisation. This seems to me to address the issue you are grappling with. However, it is a complex consultative process in and of itself, with multiple stakeholders involved. The 'James Lind Alliance' has developed an approach that is now being adopted by the National Institute for Health Research in the UK - see the link below...
  • asked a question related to Knowledge Discovery
Question
6 answers
After several decades of exponential growth of studies on complexity, it seems to me that society remains on the sidelines, as if all research and serious advances were considered a mere academic debate far from people's real needs. What should the complexity research community do to permeate society?
Relevant answer
Answer
Even though the critical mass is rapidly increasing (seminars, conferences, papers, books all around the world), it is clear that complexity science still remains alternative thinking. It is far from being "normal science". No problem with that.
The truth is that normal science has a clear intention to co-opt complexity science, but it can't. It has already co-opted, say, complex thinking (= Morin), systems science, and some close fields. I personally find it very troublesome that many complexologists do see non-linearity but then work on it via analytical methods, thereby killing the non-linearity.
As Katherine says, education is a feasible and necessary road/tool. Yet I am not convinced that it is the only one.
Let me put it in the following terms: vis-à-vis the systemic and systematic crises going on out there, we need a more radical mindset. That is certainly not systems thinking; it is complexity science at large. Hence, a set of actions and plans is to be undertaken, I guess.
  • asked a question related to Knowledge Discovery
Question
9 answers
Dear all, I need a little help starting my research work. I want to develop a knowledge-based event detection system for social media big data, based on text messages and spatio-temporal information. The system should be able to tell what is happening, and where, in some region, for example London: an emergency situation, fire, earthquake, traffic jam, rain, disease, a volleyball or football match, etc.
So my question: is it possible to detect such things from social media, for example from Twitter?
If yes, then what kinds of data mining and machine learning algorithms do I need to use to start my research work? Please, can anyone help me with my question? Thank you in advance for your answer.
Best Regards,
Siraj
Relevant answer
Answer
Hi Siraj
Twitter includes GIS information with the authorisation of the user; that is, if the tweet is sent from a mobile phone with GPS, you will have the location.
If you want to start with Twitter, you need to register as a developer. Yelp also gives you the position of the place, though that is perhaps less interesting in your case.
For text analysis, SVM gives good results in classification. The other relevant area is sentiment analysis; this may not apply directly in your case, but if you look for a panic emotion, for example, it could be of interest. The algorithm behind it is Naive Bayes in most cases. One issue with sentiment analysis can be the processing time: a few days if you have a significant corpus, depending on the lexicon you use or have to rebuild in the case of domain-specific words.
If you work in R you should check two packages, RTextTools and tm: you can build the corpus, do the preprocessing, and build the document-term matrix with those packages, and then feed your models.
Hope this helps.
Alain
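For what it's worth, the same corpus-to-classifier workflow (corpus, preprocessing, document-term matrix, SVM) has a compact Python counterpart; this is only a sketch with invented tweets and labels, not a production event-detection pipeline:

```python
# Sketch: SVM text classification for event detection, mirroring the
# R workflow described above. The tweets and labels are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

tweets = [
    "huge fire near the station, smoke everywhere",
    "traffic jam on the M25 again this morning",
    "amazing goal! what a football match tonight",
    "earthquake felt across the city minutes ago",
]
labels = ["fire", "traffic", "sport", "earthquake"]

# TfidfVectorizer builds the document-term matrix; LinearSVC classifies it.
model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(tweets, labels)

print(model.predict(["stuck in a terrible traffic jam downtown"])[0])
```

A real system would need far more labeled data per event class, plus the spatio-temporal filtering the question mentions, which this sketch leaves out.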
  • asked a question related to Knowledge Discovery
Question
11 answers
The general Apriori algorithm performs separate passes for each episode length, building the "episode relations structure" level by level.
I am curious whether there is an algorithm that uses the Apriori principle but performs only one pass: at the beginning it would track only single-item episodes, and as it moves along the data it would track more and more complex episodes.
Relevant answer
Answer
Hello Oleg,
To my knowledge, there is no algorithm based on Apriori's principle that works in only one pass. Apriori is basically an enumerative algorithm: the way the candidates are enumerated is the core of such an algorithm. Your idea is the opposite of that approach: it would consist in enumerating all the candidates at once and then checking their representativity. But since the search space (i.e. the number of candidates) is generally so huge, there is no operational solution along those lines: the basic idea of Apriori is precisely to reduce the search space step by step, so that the candidates at one step determine the sub-space that is of no interest at the next step.
As an illustration, I developed a new KDD process called TOM4L that works in one pass. TOM4L is not an enumerative approach: it is an inductive process based on the combination of induced binary relations. To my knowledge, TOM4L is the only KDD approach working this way. It produces better models than the Apriori-like algorithms, and much faster, because it requires only one reading of the database.
I don't know why you asked this question, but I am sorry if I have disappointed you with my answer. Hoping that someone will contradict me!
Best regards,
Marc
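The level-wise enumeration described above can be sketched minimally in Python; the transactions and support threshold are illustrative:

```python
# Sketch: Apriori's level-wise passes: one database scan per itemset length,
# pruning candidates that have an infrequent subset. Data is illustrative.
from itertools import combinations

transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
]
min_support = 2

def apriori(transactions, min_support):
    items = sorted({i for t in transactions for i in t})
    frequent = {}
    current = [frozenset([i]) for i in items]
    k = 1
    while current:
        # One pass over the database per level: count each candidate's support.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Generate (k+1)-candidates whose every k-subset is frequent (pruning).
        keys = sorted(level, key=sorted)
        current = [
            a | b for a, b in combinations(keys, 2)
            if len(a | b) == k + 1
            and all(frozenset(s) in level for s in combinations(a | b, k))
        ]
        k += 1
    return frequent

freq = apriori(transactions, min_support)
print(freq[frozenset({"a", "b"})])  # {a, b} appears in 2 transactions
```

Each iteration of the `while` loop is one full database scan, which is exactly the multi-pass behaviour the question asks whether one can avoid.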
  • asked a question related to Knowledge Discovery
Question
4 answers
Does anybody have an example of research conducted in one area of science that when cross pollinated with research outputs from a different field of research, led to a breakthrough in a completely new area.
I am specifically looking for examples of research being conducted in two completely different areas and with no obvious connection that when brought together, by whatever means, led to a new discovery, process, or solution to an outstanding problem.
Relevant answer
Answer
I think that that there are many potentially fruitful combinations. For example, economics and physics. But I think that both coauthors should have at least elementary education in both sciences, otherwise they may not understand each other, also due to different terminology. I have an article about that on RG:
Yegorov Y. (2007) Econo-physics: A Perspective of Matching Two Sciences
I think that social sciences can also be matched with computer science, in particular with the theory of networks. Here, if you are a social scientist and can formulate a good problem (like the propagation of drugs), you might not know computer science, but you can find someone who can run simulations with networks, study their statistical properties, etc.
  • asked a question related to Knowledge Discovery
Question
7 answers
Hi guys,
I'm still a newbie in big data analysis.
I'm currently looking to do incomplete-data analysis on big data with the R rattle package.
I am using this book as my reference for the analysis, but it does not focus specifically on big data.
I would appreciate your knowledge sharing or opinions on analysing big data:
1) Any recommended data set that is large enough to count as big data?
2) At what size should a data set be considered big?
3) Any recommended tutorials for practicing incomplete big data analysis?
Thank you in advance
Relevant answer
Answer
Hi,
First, you will have to understand the 3 Vs of Big Data (volume, velocity, variety), which will guide your choice of tool for data exploration. You will also need a solid grasp of basic machine learning techniques to build an effective system.
You are welcome to ask more if you have a specific query.
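For the incomplete-data part of the question, a minimal sketch of quantifying and imputing missing values (hypothetical toy data, shown in Python/pandas rather than R's rattle; on genuinely big data the same logic is applied chunk-wise or on a distributed engine):

```python
import numpy as np
import pandas as pd

# Toy stand-in for a much larger data set with missing entries
df = pd.DataFrame({
    "age":    [25, np.nan, 47, 51, np.nan],
    "income": [30_000, 42_000, np.nan, 58_000, 61_000],
})

# 1) Quantify missingness per column before choosing a strategy
print(df.isna().mean())

# 2) Simple mean imputation as a baseline
df_filled = df.fillna(df.mean())
print(df_filled)
```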
Regards
  • asked a question related to Knowledge Discovery
Question
1 answer
At the moment, I was able to find these papers:
1. Prototype a Knowledge Discovery Infrastructure by Implementing Relational Grid Monitoring Architecture (R-GMA) on European Data Grid (EDG) by Frank Wang, Na Helian, Yike Guo, Steve Thompson, John Gordon.
2. Knowledge grid-based problem-solving platform by Lu Zhen, Zuhua Jiang,Jun Liang.
Thank you in advance for any help.
Relevant answer
Answer
Hello Pawel,
Look into the attached paper, and this link for the application.
This paper gives you a well-structured idea of the application and extension of Grid technology to knowledge discovery in Grid databases.
If you are working on larger datasets, I'm certain OLAP could help you and provide better results than any other approach.
Furthermore, you will also need to work on the performance results and usability of such applications.
Regards,
Manish
  • asked a question related to Knowledge Discovery
Question
13 answers
Dear all
How can I find medical rule bases (knowledge bases)? Is the language they are implemented in important for working with them? What are those languages?
Relevant answer
Answer
The largest and most comprehensive hierarchical vocabulary of medical terms is SNOMED Clinical Terms (SNOMED CT). It is more a knowledge taxonomy than an ontology, but it is currently the best knowledge base of general medical terms there is. 
References:
There are several other medical ontologies under development, but they are significantly less expressive (i.e. smaller) than SNOMED. Some are academic and open/free, some are commercial. Additionally, many of these projects are unfinished, and work on them stopped a while back.
  • Ontology for General Medical Science (OGMS), https://code.google.com/p/ogms/
  • GALEN and the "Galen-Core" high-level ontology for medicine.
  • GuideLine Interchange Format (GLIF), a computer-interpretable language for modeling and executing clinical practice guidelines that can be easily integrated into Protege ontology builder.
  • Collaborative Open Ontology Development Environment project (CO-ODE), Medical Informatics Group at the University of Manchester
  • LinKBase, knowledge base of over 1 million language-independent medical concepts featuring an ontology with a formal conceptual description of the medical domain, Language and Computing N.V., Belgium
  • The Medical Ontology Research program, Lister Hill National Center for Biomedical Communications. The aim was to develop a sound medical ontology to enable various knowledge processing applications to communicate with one another.
  • The ONIONS methodology, designed to build the ON9 medical ontology.
  • MedO, a bio-medical ontology developed at the Institute of Formal Ontology and Medical Information Systems, Germany.
My advice would be to first take a look at SNOMED, OGMS, GALEN and - by all means - GLIF.
Kind regards,
Marko
  • asked a question related to Knowledge Discovery
Question
1 answer
-
Relevant answer
Answer
KDM is a multipurpose standard metamodel that represents all aspects of existing information technology architectures. The idea behind the KDM standard is that the community creates parsers from different languages to KDM. As a result, everything that takes KDM as input can be considered platform- and language-independent. For example, a refactoring catalogue for KDM can be used to refactor systems implemented in different languages.
  • asked a question related to Knowledge Discovery
Question
8 answers
Hi all, we're considering different libraries and approaches for constructing large graphs (using Twitter data). Once constructed, graph algorithms will be evaluated, so the purpose is knowledge discovery rather than visualization.
Currently, it takes 4 hours to generate a graph for a data set with 21 days' worth of Twitter data.
1. Which would you recommend for this purpose, Neo4j or JUNG?
2. Would Hadoop be a good choice to spread the construction task across different machines in order to save time?
3. How can we store a very large graph in memory, and how can it be made persistent?
Relevant answer
Answer
Hi,
1) Right now I'm using MIT's libraries with Octave to compute different kinds of network indicators. I'm using Gephi for visualization in parts (days or months) because my network is very large (218 million records). I process the data in parts (days or months) and store it as Octave matrices.
2) Gephi has some functionality to share your visualization, but I have not tried it yet.
3) Right now I'm testing another approach: using MySQL as the data store and Gephi's functionality to connect to the data. It seems to work.
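To the original questions about construction, running algorithms, and persistence, a minimal sketch with NetworkX may also help (the edge list is hypothetical; at hundreds of millions of records you would batch the loading, but the API shape is the same):

```python
import os
import tempfile
import networkx as nx

# Hypothetical (user -> mentioned user) pairs parsed from tweets
edges = [("alice", "bob"), ("bob", "carol"),
         ("alice", "carol"), ("carol", "bob")]

G = nx.DiGraph()
G.add_edges_from(edges)

# Run graph algorithms directly on the in-memory graph
pr = nx.pagerank(G)
print(max(pr, key=pr.get))  # most "influential" node

# Persist in a portable format and reload without rebuilding
path = os.path.join(tempfile.mkdtemp(), "twitter_graph.graphml")
nx.write_graphml(G, path)
H = nx.read_graphml(path)
```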
  • asked a question related to Knowledge Discovery
Question
3 answers
I am looking for a tool that shows me the value of a given engineered feature on a desired part of the corpus. I need not only to visualize this but also to get the distribution of every feature across the corpus (perhaps in vector form) as output, so that I can analyze it faster in code rather than manually.
Does anyone know of a tool or resource even somewhat similar to what I am looking for?
Thanks in advance.
Relevant answer
Answer
If you are interested in latent features and the probabilistic distribution of a text corpus, Mallet (http://mallet.cs.umass.edu/) may be useful.
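As a rough code-level illustration of getting per-document distributions of latent features (shown with scikit-learn's LDA rather than Mallet; the corpus is a made-up stand-in):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Tiny stand-in corpus; in practice this would be the full document set
docs = [
    "graph nodes edges network",
    "network edges graph analysis",
    "protein gene cell biology",
    "gene expression cell protein",
]

X = CountVectorizer().fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

# One topic-distribution vector per document, analyzable in code
theta = lda.transform(X)
print(theta.shape)  # (4, 2)
```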
  • asked a question related to Knowledge Discovery
Question
11 answers
I would be grateful if someone could point me to references for algorithms/approaches/techniques.
Relevant answer
Answer
You may also be interested by Affinity Propagation. (http://www.psi.toronto.edu/index.php?q=affinity%20propagation)
This algorithm makes very weak assumptions about your data: all you need to do is define some similarity between data points. However, this similarity doesn't even have to be a proper distance function: it does not need to be symmetrical, and not all data points need to be related.
You can look at the original Science paper (http://www.sciencemag.org/content/315/5814/972.short) for examples of applications, such as clustering faces, genes expression or flight connections.
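A minimal usage sketch with scikit-learn's implementation (toy two-blob data; the default affinity is negative squared Euclidean distance, but a precomputed, possibly asymmetric, similarity matrix can be passed instead):

```python
import numpy as np
from sklearn.cluster import AffinityPropagation

# Two well-separated blobs; note we never specify a cluster count
X = np.array([[1.0, 1.0], [1.2, 0.9], [0.9, 1.1],
              [8.0, 8.0], [8.1, 7.9], [7.9, 8.2]])

ap = AffinityPropagation(random_state=0).fit(X)
print(ap.labels_)           # e.g. [0 0 0 1 1 1]
print(ap.cluster_centers_)  # the chosen "exemplar" points
```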
  • asked a question related to Knowledge Discovery
Question
12 answers
I've done some research on the topic and, so far, have found that the following patterns have been thoroughly explored:
1) Sequence Mining. Example: "A-B-C-D happened in 10% of the database". This can also include a time constraint between events.
2) Temporal Association Rules: "A,B,C->D (30%, 20%) between 7am and 10am". In this case, the temporal information is used to "slice" the database into n parts, which are then used to extract traditional association rules.
My first impression is that many recent researchers focus on performing either 1 or 2 efficiently, or on porting them to data-stream environments. I am wondering whether there are other patterns that can be obtained from timestamped data, for instance "A,B->C (20%, 30%)", such that A and B happen in any order in a time window of at least 10 hours and at most 20 hours before C.
If nothing similar exists, is it worth the effort to develop a new data mining method to extract patterns like this?
Also, if you know of any open datasets with the following characteristics, please let me know. I've tried the UCI repository, but no success so far.
Events with their respective timestamps for many individuals or sensors. Example:
1 A (2012-02-20) B (2012-03-23) C (2013-01-20)
2 B (2003-04-30) D (2004-03-20)
3 B (2010-09-10) A (2010-10-01) C (2010-10-02)
Relevant answer
Answer
Thanks for your answer, Mr. Mukesh. I am not familiar with HMM and CRF; could you please advise whether these probabilistic models are able to incorporate information about the time distances between events? For example, assuming we are mining the data below, is there a way to parametrize either of these probabilistic methods so that they can uncover patterns such as 'A occurs before B in about a month', based on entries #1 (A occurs before B [~32 days]) and #4 (A occurs before B [~28 days])?
#1 A (2012-02-20) B (2012-03-23) C (2013-01-20)
#2 B (2003-04-30) D (2004-03-20)
#3 B (2010-09-10) A (2010-10-01) C (2010-10-02)
#4 A (2010-01-10) B (2010-02-07)
Best Regards,
Heitor
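The window-constrained pattern discussed above ('A before B with a gap in a given range') can be counted with a direct scan; a toy sketch over the example sequences (the helper name is made up):

```python
from collections import Counter
from datetime import date

# The four example sequences as (event, timestamp) pairs
sequences = [
    [("A", date(2012, 2, 20)), ("B", date(2012, 3, 23)), ("C", date(2013, 1, 20))],
    [("B", date(2003, 4, 30)), ("D", date(2004, 3, 20))],
    [("B", date(2010, 9, 10)), ("A", date(2010, 10, 1)), ("C", date(2010, 10, 2))],
    [("A", date(2010, 1, 10)), ("B", date(2010, 2, 7))],
]

def window_pairs(sequences, min_days, max_days):
    """Count 'X then Y' pairs whose gap in days lies in [min_days, max_days]."""
    counts = Counter()
    for seq in sequences:
        seen = set()  # count each (X, Y) at most once per sequence
        for i, (x, tx) in enumerate(seq):
            for y, ty in seq[i + 1:]:
                if min_days <= (ty - tx).days <= max_days and (x, y) not in seen:
                    seen.add((x, y))
                    counts[(x, y)] += 1
    return counts

# 'A before B in about a month' holds in #1 (~32 days) and #4 (~28 days)
print(window_pairs(sequences, 20, 40)[("A", "B")])  # 2
```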
  • asked a question related to Knowledge Discovery
Question
86 answers
There are many experienced professors on this network. I need your guidance on effective teaching. I hope your shared experience will help not only me but also many other lecturers.
Relevant answer
Answer
My teaching philosophy is to be passionate, enthusiastic, and engaging every time that I give a lecture so students are excited to learn more about the topic and become active learners. I don’t try to be the favorite teacher. Instead, I want to help the students to become self-sufficient, as scientists and physicians at the forefront of the field for the next half-century. I am most proud when I can make a positive difference in someone’s academic career and see the amazing accomplishments of those whom I have mentored. I truly believe that the future is in the hands of the students we lead toward careers as scientists and physicians. While in research we are only as good as our last experiment, the students we mentor and their future students are a legacy that will endure. Finally, Henry Ford once said, “anyone who stops learning is old, whether at age 20 or 80”. I truly believe learning is a lifelong process and my role as an educator is to instill lifelong learning in all my students
  • asked a question related to Knowledge Discovery
Question
9 answers
Opinions, recommendations and success stories can be beneficial for many. Can it be helpful for sharing tacit knowledge?
Relevant answer
Answer
Tacit knowledge has been discussed in various discourses, especially Knowledge Management, since Michael Polanyi introduced the term in the late 1950s. Polanyi used it in the sense that we can sometimes have knowledge (in particular, know-how) borne of experience that is not easily communicated or made explicit. Nonetheless, one of the pursuits of KM in its early years was how to "capture" such knowledge. Apprenticeship is used as a classic example of one way to acquire such tacit knowledge - through observation and practice. Mentoring is another proven pathway.
I have seen numerous attempts to "define" e-learning, some better than others & I've come to the conclusion that definitions are usually only useful for specific contexts. In some ways it's just a term, a construct, that has been adopted by many communities of practice. In some ways, all we need to do is to "make sense" of what is being said when this term is being used. Definitions are best left for formal documents like standards and reports.
For me, e-learning in the workplace can take place in all kinds of ways and doesn't have to be seen as part of a program of coursework, training or formal relationship & can happen informally through using a wide variety of tools & services available via the Web. I know I sometimes learn a lot through searching & I certainly learn from engaging with peers through social media.
  • asked a question related to Knowledge Discovery
Question
34 answers
Machine Learning vs. Data Mining.
Relevant answer
Answer
DM uses the techniques of ML (and other fields of AI), but DM also deals with the visualization of (big) data and, in some ways, with methods for the storage, management, and querying of big data. So DM and ML intersect, but neither is completely a subset of the other.