Automatic Machine Learning

This request for information (RFI) is intended to provide information relevant to a possible future IARPA investment (such as a program or grand challenge). Respondents are invited to comment on the content of this announcement, including suggestions for improving the scope of a possible solicitation, so that every effort is made to adequately address the scientific and technical challenges described below. Responses to this request may be used to support development of, and may subsequently be incorporated within, a future IARPA solicitation and therefore must be available for unrestricted public distribution. Neither proprietary nor classified concepts or information should be included in the responses. The following sections of this announcement describe the scope of technical efforts of interest and give instructions for the submission of responses.

Background & Scope

Machine learning (ML) is used extensively in application areas of interest to IARPA, including speech, language, vision, sensor processing, and multi-modal integration. Typically, expert practitioners in ML select appropriate architectures and algorithms for the application domain, performance requirements, and data characteristics of the problem at hand. Additionally, they engineer an appropriate set of features to be extracted from the data for use in the system design. Then, depending on the problem, data may be selected for training and scheduled for presentation to the system according to the requirements of the task. In some application areas, the data needed for training are extremely sparse, consisting of only a few instances, and important information may be missing, requiring the application of supplementary information and real-world knowledge for intelligent inference. In many other application areas, the amount of data to be analyzed (from sensors, audio and video, social networks, and the web) has been increasing exponentially, stressing even the most efficient procedures and the most powerful processors. Most of these data are unorganized and unlabeled, and human effort is needed to annotate them and to focus attention on those data that are significant.

The focus of this RFI is on recent advances toward automatic machine learning, including automation of architecture and algorithm selection and combination, feature engineering, and training data scheduling for usability by non-experts, as well as scalability for handling large volumes of data. Useful automatic machine learning systems will require significant innovations in the science and technology of machine learning, possibly including (but not limited to): hierarchical architectures such as Deep Belief Nets and hierarchical clustering; methods for parallelization of computation; attentional mechanisms for focusing on data of significance; methods for transferring previously learned knowledge to a new task; methods for incorporating real-world knowledge, including human advisors and one-shot learning methods; methods that account for different temporal scales and the effects of causality; the role of goals and environmental feedback in learning; and model selection through approaches such as meta-learning.
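
One simple baseline for the automated algorithm selection described above is to score several candidate learners by cross-validation and keep the one with the best held-out accuracy. The following Python sketch illustrates that idea only. The three candidate learners, the toy data set, and every function name are hypothetical choices made for illustration; the sketch is not a method proposed or required by this RFI.

```python
"""Toy sketch: pick a learner automatically by k-fold cross-validation.

All names, learners, and data here are illustrative assumptions.
"""
from statistics import mean


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def majority_class(train, x):
    """Baseline: always predict the most common training label."""
    labels = [y for _, y in train]
    return max(sorted(set(labels)), key=labels.count)


def nearest_centroid(train, x):
    """Predict the class whose mean feature vector is closest to x."""
    centroids = {}
    for label in sorted({y for _, y in train}):
        rows = [f for f, y in train if y == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return min(centroids, key=lambda lab: sq_dist(centroids[lab], x))


def nearest_neighbor(train, x):
    """1-nearest-neighbor: copy the label of the closest training point."""
    return min(train, key=lambda ex: sq_dist(ex[0], x))[1]


def cv_accuracy(learner, data, k=4):
    """Mean held-out accuracy over k interleaved folds."""
    scores = []
    for i in range(k):
        test = data[i::k]
        train = [ex for j, ex in enumerate(data) if j % k != i]
        hits = sum(learner(train, x) == y for x, y in test)
        scores.append(hits / len(test))
    return mean(scores)


def select_algorithm(candidates, data, k=4):
    """Return the name of the best-scoring candidate and all scores."""
    scores = {name: cv_accuracy(fn, data, k) for name, fn in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores


CANDIDATES = {
    "majority_class": majority_class,
    "nearest_centroid": nearest_centroid,
    "nearest_neighbor": nearest_neighbor,
}

# Hypothetical toy data: two well-separated 2-D clusters, "a" and "b".
DATA = [
    ([0.0, 0.2], "a"), ([5.0, 5.1], "b"),
    ([4.8, 5.3], "b"), ([0.3, 0.1], "a"),
    ([5.2, 4.9], "b"), ([0.1, 0.4], "a"),
    ([0.4, 0.0], "a"), ([5.1, 5.0], "b"),
]

best, scores = select_algorithm(CANDIDATES, DATA)
print("selected:", best)
print("scores:", scores)
```

A non-expert user of such a loop supplies only labeled data; the choice among learners is made by the held-out score. Real systems would also need to search over features, hyperparameters, and combinations of learners, which this sketch does not attempt.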

Responses to this RFI will be used to help focus and organize an interactive workshop of selected machine-learning practitioners with the goal of eliciting plausible next steps through focused presentations of ideas and guided discussions.

Responses to this RFI should be as succinct as possible while providing specific information that addresses the following questions:

  1. What are your proposed methods for (a) automation of architecture and algorithm selection and combination, (b) feature engineering, and (c) training data scheduling? How will these automation methods affect the usability of an analytic system by non-experts?
  2. What are the compelling reasons to use your proposed approach in a scalable multi-modal analytic system?
  3. How will your approach handle different time scales, missing data, and sparse data?
  4. How will your approach be applied to diverse data, such as speech, language, vision, sensor processing, and multi-modal integration?
  5. How will you supplement training data with real-world and previously learned knowledge?
  6. What is known about your proposed approach? Please provide suitable references.
  7. What are the appropriate metrics to measure performance?
  8. What other solutions are being suggested to overcome the challenges in this RFI?
  9. What is the timescale needed to demonstrate progress?
  10. What are the data sets and other resources needed?
  11. Are supporting technologies readily available or does new technology need to be created?

The responses to this RFI will be used to help plan a 1.5-day workshop on automatic machine learning. An expected result is the identification of promising areas for investment through vehicles such as seedlings, grand challenges, and programs. It is anticipated that this workshop will be held in late March 2012. A separate workshop announcement will be posted with further details.


Preparation Instructions to Respondents

IARPA invites respondents to submit ideas related to this topic for use by the Government in formulating a potential program. IARPA requests that submittals briefly and clearly describe the potential approach or concept, outline critical technical issues, and comment on the expected performance, robustness, and estimated cost of the proposed approach. This announcement contains all of the information required to submit a response. No additional forms, kits, or other materials are needed.

IARPA welcomes responses from all capable and qualified sources, both inside and outside the U.S. Because IARPA is interested in an integrated approach, responses from teams with complementary areas of expertise are encouraged. Responses must meet the following formatting requirements:

  1. A one-page cover sheet that identifies the title, organization(s), and respondent's technical and administrative points of contact (including names, addresses, phone and fax numbers, and email addresses of all co-authors) and that clearly indicates its association with IARPA-RFI-12-01;
  2. A substantive, focused, one-half page executive summary;
  3. A description (limited to 5 pages in Times New Roman font of at least 12 points, formatted for single-sided, single-spaced 8.5-by-11-inch paper with 1-inch margins) of the technical challenges and suggested approach(es);
  4. A list of citations (any significant claims or reports of success must be accompanied by citations, and reference material MUST be attached);
  5. Optionally, a single overview briefing chart graphically depicting the key ideas.


Disclaimers and Important Notes

This is an RFI issued solely for information and new investment planning purposes and does not constitute a solicitation. Respondents are advised that IARPA is under no obligation to acknowledge receipt of the information received or to provide feedback to respondents with respect to any information submitted under this RFI.

Responses to this notice are not offers and cannot be accepted by the Government to form a binding contract. Respondents are solely responsible for all expenses associated with responding to this RFI. It is the respondents' responsibility to ensure that the submitted material has been approved for public release by the organization that funded whatever research is referred to in the response.

The Government does not intend to award a contract on the basis of this RFI or to otherwise pay for the information solicited, nor is the Government obligated to issue a solicitation based on responses received. Neither proprietary nor classified concepts or information should be included in the submittal. Input on technical aspects of the responses may be solicited by IARPA from non-Government consultants/experts who are bound by appropriate non-disclosure requirements.


For information contact:
Posted Date: November 30, 2011
Responses Due: January 27, 2012