This dissertation focuses on teams of multiple heterogeneous, intelligent agents (hardware or software) that collaborate to learn a task and are capable of sharing knowledge. Collaborative learning in multi-agent and multi-robot systems is largely understudied, and further research is needed to gain a deeper understanding of team learning. This work presents experimental results that illustrate the importance of heterogeneous teams of collaborative learning agents, and outlines heuristics that govern the successful construction of teams of classifiers. A number of application domains are studied in this dissertation.
One approach focuses on the effects of knowledge sharing and collaboration among multiple heterogeneous, intelligent agents (hardware or software) that work together to learn a task. Because each agent employs a different machine learning technique, the system consists of multiple knowledge sources and their respective heterogeneous knowledge representations. Collaboration between agents involves sharing knowledge both to speed up team learning and to refine the team's overall performance and group behavior. Experiments were performed that vary the team composition in terms of machine learning algorithms, the learning strategies employed by the agents, and the sharing frequency for a predator-prey cooperative pursuit task. For lifelong learning, heterogeneous learning teams were more successful than their homogeneous counterparts. Interestingly, sharing increased the learning rate, but sharing with higher frequency showed diminishing results. Lastly, knowledge conflicts decreased over time as more sharing took place. These results support further investigation of the merits of heterogeneous learning. This dissertation also focuses on discovering heuristics for constructing successful teams of heterogeneous classifiers, including many aspects of team learning and collaboration.
In one application, multi-agent machine learning and classifier combination are used to learn rock facies sequences from wireline well log data. Gas and oil reservoirs have been the focus of modeling efforts for many years in an attempt to locate zones with high volumes. Certain subsurface layers and layer sequences, such as those containing shale, are known to be impermeable to gas and/or liquid. Oil and natural gas become trapped by these layers, making it possible to drill wells to reach the supply and extract it for use. Drilling these wells, however, is costly. Here, the focus is on how to construct a successful set of classifiers that periodically collaborate to increase classification accuracy. Utilizing multiple, heterogeneous collaborative learning agents is shown to be successful for this classification problem. We obtained 84.5% absolute accuracy using the Multi-Agent Collaborative Learning Architecture, an improvement of about 6.5% over the best results achieved by the Kansas Geological Survey with the same data set. Several heuristics are presented for constructing teams of multiple collaborative classifiers for predicting rock facies.
Another application uses multi-agent machine learning and classifier combination to learn water presence from airborne polar radar data acquired from Greenland in 1999 and 2007. Ground and airborne depth-soundings of the Greenland and Antarctic ice sheets have been used for many years to determine characteristics such as ice thickness, subglacial topography, and mass balance of large bodies of ice. Ice coring efforts have supported these radar data by providing ground truth for validating the state (wet or frozen) of the interface between the bottom of the ice sheet and the underlying bedrock. Subglacial state governs the friction, flow speed, transport of material, and overall change of the ice sheet. In this dissertation, we focus on how to construct a successful set of classifiers that periodically collaborate to increase classification accuracy. The underlying method yields radar independence, allowing model transfer from the 1999 data to the 2007 data to produce water presence maps of the Greenland ice sheet with differing radars. We obtained 86% accuracy using the Multi-Agent Collaborative Learning Architecture with this data set. Utilizing multiple, heterogeneous collaborative learning agents is shown to be successful for this classification problem as well. Several heuristics, some of which agree with those found in the other applications, are presented for constructing teams of multiple collaborative classifiers for predicting subglacial water presence.
General findings from these experiments suggest that constructing a team of classifiers from a heterogeneous mixture of homogeneous teams is preferred. Larger teams generally perform better, as decisions from multiple learners can be combined to arrive at a consensus decision. Employing heterogeneous learning algorithms integrates different error models to arrive at higher-accuracy classification from complementary knowledge bases. Collaboration, although not found to be universally useful, offers certain team configurations an advantage. Collaboration at low to medium frequency was found to be beneficial, while high-frequency collaboration was found to be detrimental to team classification accuracy. Full mode learning, where each learner receives the entire training set for the learning phase, consistently outperforms independent mode learning, where the training set is divided among the learners in a team in a non-overlapping fashion. Results presented in this dissertation support the application of multi-agent machine learning and collaboration to current challenging, real-world classification problems.
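The difference between full mode and independent mode learning, and the idea of combining learner decisions into a consensus, can be illustrated with a minimal sketch. It is not the Multi-Agent Collaborative Learning Architecture itself: the function names are hypothetical, the round-robin partitioning is only one possible non-overlapping split, and plurality voting is assumed here as the consensus rule, since the combination method is not specified above.

```python
from collections import Counter
from typing import Hashable, List, Sequence, TypeVar

T = TypeVar("T")


def full_mode_split(training_set: Sequence[T], n_learners: int) -> List[List[T]]:
    """Full mode: every learner receives the entire training set."""
    return [list(training_set) for _ in range(n_learners)]


def independent_mode_split(training_set: Sequence[T], n_learners: int) -> List[List[T]]:
    """Independent mode: learners receive non-overlapping shares.

    Dealing samples round-robin is an assumption made for illustration.
    """
    return [list(training_set[i::n_learners]) for i in range(n_learners)]


def majority_vote(predictions: Sequence[Hashable]) -> Hashable:
    """Consensus by plurality vote (assumed rule); ties go to the label seen first."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical usage: six sample indices shared among three learners.
samples = list(range(6))
full = full_mode_split(samples, 3)
independent = independent_mode_split(samples, 3)

# Three learners vote on the subglacial state of one location.
consensus = majority_vote(["wet", "frozen", "wet"])
```

In this sketch each learner in full mode sees all six samples, whereas in independent mode each learner sees only two, which is one plausible reason a learner trained in independent mode would generalize less well on its own.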