Functional Map of the World Challenge

Can you build algorithms to classify facility, building, and land use from satellite imagery?

The Challenge Has Ended. See Results Below.

The Challenge

Create fast and accurate building and land use classification algorithms

Final Results

View Full Results


View Top Winners Below


Compete for $100,000 in prizes

The Problem

Intelligence analysts, policy makers, and first responders around the world rely on geospatial land use data to inform crucial decisions about global defense and humanitarian activities. Historically, analysts have manually identified and classified geospatial information by comparing and analyzing satellite images, but that process is time-consuming and insufficient to support disaster response.

The Functional Map of the World (fMoW) Challenge seeks to foster breakthroughs in the automated analysis of overhead imagery by harnessing the collective power of the global data science and machine learning communities. The challenge will publish one of the largest publicly available satellite-image datasets to date, with more than one million points of interest from around the world. The dataset contains satellite-specific metadata that researchers can exploit to build a competitive algorithm that classifies facility, building, and land use.

Be Part of the Innovation

IARPA is conducting this Challenge to invite the broader research community of industry and academia, with or without experience in deep learning and computer vision analysis, to participate in a convenient, efficient, and non-contractual way. Participants will develop algorithms that scan satellite data to identify functions based on multiple reference sources, such as overhead and ground-based images, digital elevation data, existing well-understood image collections, surface geology, geography, and cultural information. The goal of the Challenge is to foster breakthroughs in the automated analysis of overhead imagery.

Challenge Details
Challenge Opens
Sep 2017

Register for the challenge at

Datasets & Training
Jul - Oct 2017

IARPA has released a large satellite imagery dataset with training, validation, and testing imagery subsets to support the fMoW Challenge. The visualization tool and benchmark example can be found here. The testing and training data are available for download via two options: 1) Amazon Web Services (AWS) and 2) BitTorrent. Below are detailed instructions for downloading the data:

To obtain the data via AWS, you must use Requester Pays. The data is available in two versions:

  • RGB JPG Data Set: arn:aws:s3:::fmow-rgb | s3://fmow-rgb
  • Multispectral TIFF Data Set: arn:aws:s3:::fmow-full | s3://fmow-full

A full set of AWS CLI resources can be found in the AWS CLI documentation. Some example commands appear below:

Each bucket contains a manifest.json.bz2 file that can be downloaded and decompressed to obtain a JSON listing of every file in the bucket:

  • aws s3api get-object --bucket fmow-rgb --key manifest.json.bz2 --request-payer requester manifest.json.bz2
  • aws s3api get-object --bucket fmow-full --key manifest.json.bz2 --request-payer requester manifest.json.bz2

Commands like these can be used to get a directory listing:

  • aws s3 ls s3://fmow-rgb --request-payer requester
  • aws s3 ls s3://fmow-full --request-payer requester
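Once downloaded, the manifest can be decompressed and parsed with the Python standard library. The sketch below assumes the decompressed payload is a JSON array of object keys; check the actual file and adjust the parsing if the structure differs.

```python
import bz2
import json

def load_manifest(path):
    """Decompress a manifest.json.bz2 file and return the parsed JSON.

    Assumes the decompressed payload is a JSON array listing every
    object key in the bucket; adjust if the real structure differs.
    """
    with bz2.open(path, "rt", encoding="utf-8") as f:
        return json.load(f)

def keys_with_suffix(manifest, suffix):
    """Filter manifest entries by file extension, e.g. '.jpg'."""
    return [key for key in manifest if key.endswith(suffix)]
```

For example, `keys_with_suffix(load_manifest("manifest.json.bz2"), ".jpg")` would list only the image files, which is useful for downloading a subset of the bucket.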

Obtaining the data via BitTorrent requires you to download, install, and correctly configure your own BitTorrent client. Once your client is installed and properly configured, downloading the data sets is relatively simple: download the torrent files of your choosing, open them in your client, and follow its instructions to begin downloading the data.

We encourage contestants who choose to use BitTorrent to continue to seed the data after their download is complete to help build and maintain a healthy population of seed nodes.

Provisional Scoring
Sep 2017

Provisional scoring will be based on your submission of results against the test set and will be evaluated by the Topcoder Marathon test system. Scoring will use an algorithm that calculates an F-score for each category and then takes a weighted average of these scores to determine an overall F-score. Your provisional score will be displayed on the Provisional Leaderboard.
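The scoring scheme described above can be sketched in a few lines of Python. The per-category weights below are placeholders; the actual weights applied by the Topcoder test system are defined in the official scoring rules.

```python
def f_score(tp, fp, fn, beta=1.0):
    """F-score from true-positive, false-positive, and false-negative counts."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

def weighted_f_score(per_category, weights):
    """Weighted average of per-category F-scores.

    per_category: dict mapping category -> (tp, fp, fn) counts
    weights:      dict mapping category -> weight (placeholder values;
                  the real challenge weights are in the scoring rules)
    """
    total_weight = sum(weights[c] for c in per_category)
    return sum(weights[c] * f_score(*per_category[c]) for c in per_category) / total_weight
```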

The top 10 competitors, according to the provisional scores at the end of the challenge, will be invited to the final testing round. For a full description of the provisional and final scoring rules and criteria, please visit the registration site.

All participants will have access to IBM Watson and IBM Bluemix for 90 days during the challenge (though these are not required and we welcome all types of solutions for the challenge). Top competitors will periodically receive access to AWS cloud computing resources to improve their algorithms; more information can be found in the Topcoder forums.

Final Submission
December 31, 2017

The challenge submission period will end. The final score shown on the Provisional Leaderboard at the end of the challenge will be used to determine solver rankings going into the final evaluation. The top 10 algorithms will be scored against a hidden data set, and the top scoring solutions will be validated by the IARPA team for award. Final scores will be posted to the leaderboard on Topcoder and shared through official IARPA communications.

Awards & Recognition
Feb 2018

The challenge winners will be invited to present their solutions to IARPA and other key leaders in the Government at a workshop in Washington, DC, and cash awards will be distributed to winners. Final public communication about winners will take place during this day-long workshop.


Participants will be eligible to win cash prizes from a total prize purse of $100,000. Additionally, top winners will get a chance to present their winning solutions at a workshop in Washington, D.C. Prizes will be distributed for the following criteria:

Main Prizes
  • First Prize: pfr, Paul Froissart
  • Second Prize: jianminsun, Jianmin Sun
  • Third Prize: usf_bulls, Team University of South Florida, Tampa (USF). Members: Rodrigo Minetto, Maurício Pamplona Segundo, Sudeep Sarkar
  • Fourth Prize: zaq1xsw2tktk, Tang Kun
  • Fifth Prize: prittm, Team YellowSubmarine, Lockheed Martin. Members: Dr. Mark Pritt, Gary Chern, Brian Gonzalez, Ryan Soldin
Additional Incentives
Open Source Solutions & fMoW dataset
Open Source Solution from pfr

This solution uses an ensemble of models based on Dual Path Networks (Chen, 2017) and the JHU/APL baseline. Model development was focused on improving image-only performance, after which special procedures were added to account for temporal views, context, dataset bias, and weighted scoring.

View on GitHub
Open Source Solution from jianminsun

Using the MXNet framework, this solution uses an ensemble of models based on ResNet and ResNeXt, combined with the JHU/APL baseline. One of the ResNet models used is also initialized with weights derived from fine-tuning on the Places-365 Challenge data. Additional steps were taken to better handle context and segment the datasets for training.

View on GitHub
Open Source Solution from usf_bulls

This solution offers a framework for creating ensembles of CNNs, called Hydra. In this instance, Hydra is used to coarsely train two instances of ResNet and DenseNet. These coarsely-trained weights are then used to instantiate multiple additional models for further fine-tuning. Data augmentation steps, including flips, zooms, shifts, and varying types of crops, were used to prevent overtraining.
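The augmentation steps mentioned above (flips, shifts, and crops) can be illustrated with a small NumPy sketch. This is an illustrative stand-in, not the Hydra team's actual pipeline, and the shift here wraps around via np.roll, whereas a real pipeline would typically pad instead.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(image, crop_size):
    """Apply a random flip, shift, and crop to an H x W x C image array.

    Illustrative only -- not the actual Hydra augmentation code.
    """
    # Random horizontal flip
    if rng.random() < 0.5:
        image = image[:, ::-1, :]
    # Random shift by up to 10% of each dimension (wrap-around via np.roll;
    # a real pipeline would zero-pad instead of wrapping)
    h, w = image.shape[:2]
    dy = int(rng.integers(-h // 10, h // 10 + 1))
    dx = int(rng.integers(-w // 10, w // 10 + 1))
    image = np.roll(image, (dy, dx), axis=(0, 1))
    # Random crop to crop_size x crop_size
    top = int(rng.integers(0, h - crop_size + 1))
    left = int(rng.integers(0, w - crop_size + 1))
    return image[top:top + crop_size, left:left + crop_size, :]
```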

View on GitHub
Full fMoW Dataset

The full dataset package consists of over 1 million images from over 200 countries and includes imagery and metadata originally sequestered from the competition dataset release. For each image, we provide at least one bounding box annotation containing one of 63 categories. This package has been split into two versions: fMoW-full and fMoW-rgb. fMoW-full is in TIFF format, contains 4-band and 8-band multispectral imagery, and is ~3.5TB in size. fMoW-rgb is in JPEG format, all multispectral imagery has been converted to RGB, and it is ~200GB in size. In addition to obtaining the data, we have also posted information on how to obtain the baseline algorithm and visualization tool.
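Reading the per-image bounding-box annotations described above could look like the sketch below. The schema here (a "bounding_boxes" list with "category" and "box" keys in each image's metadata JSON) is an assumption for illustration; verify field names against the actual fMoW dataset documentation.

```python
import json

def read_annotations(json_path):
    """Return (category, box) pairs for one image's metadata file.

    Assumes each metadata JSON holds a 'bounding_boxes' list whose entries
    have 'category' and 'box' keys -- an assumed schema; verify against
    the actual fMoW dataset documentation.
    """
    with open(json_path, "r", encoding="utf-8") as f:
        meta = json.load(f)
    return [(bb["category"], bb["box"]) for bb in meta.get("bounding_boxes", [])]
```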

View instructions for downloading the full dataset
Rules & Eligibility

Anyone 18 years or older is eligible to register. Certain individuals and groups with existing agreements with IARPA, IARPA government partners, and their affiliates are welcome to participate in the challenge but must forgo the monetary prizes; they may still compete for standing on the leaderboard and other non-monetary incentives. Participants can also form teams to collaborate on solutions. For a complete list of rules and eligibility requirements go to the registration site.


To assist participants in this challenge, IARPA has gathered a list of resources for each stage of the challenge. You can find sample datasets, reference capture methods, evaluation guides, and technical documents to support your submissions.

ImageNet Database

Large repository of images with associated bounding boxes
COCO Dataset

Image recognition, segmentation, and captioning dataset.

Standardized image data sets for object class recognition
Kaggle Competition Example

Satellite data competition: Understanding the Amazon from Space
Kaggle Dstl Satellite Imagery Feature Detection Challenge

Challenge from the Defence Science and Technology Laboratory (Dstl) to accurately classify features in overhead imagery
Land Use Dataset

UC Merced image dataset with 21 classes with 100 images per class

SpaceNet online repository of freely available satellite imagery

Publicly available benchmark for Remote Sensing Image Scene Classification (RESISC), created by Northwestern Polytechnical University (NWPU)
AID Project

Benchmark Dataset for Performance Evaluation of Aerial Scene Classification

Google image dataset of mainly urban areas of China
TorontoCity: Seeing the World with a Million Eyes

TorontoCity benchmark providing different perspectives of the world captured from airplanes, drones and cars driving around the city
Remote Sensing Image Scene Classification: Benchmark and State of the Art

Comprehensive review of recent progress in remote sensing image scene classification; proposes a large-scale benchmark dataset termed "NWPU-RESISC45"
TensorFlow Tutorial

This guide gets you started programming in TensorFlow to help you build your neural network
The Neural Network Zoo

A cheat sheet guide containing many neural network architectures and their explanations
Amazon AI Services

A listing of Amazon AI Services that can be used on the challenge
fMoW Benchmark Example

Benchmark Algorithm and code used for the fMoW Challenge
SpaceNet GitHub Repository

Packages intended to assist in preprocessing the SpaceNet satellite imagery corpus into a format consumable by machine learning algorithms.
DigitalGlobe DeepCore SDK

The DeepCore Machine Learning Abstraction Framework is a utility toolkit that allows a user to download imagery, perform image classification or object detection, and manipulate geospatial vector files.
Connect With Us

Learn more at

By Email

Reach out directly at

On Twitter

Find us on Twitter at @IARPAnews & #IARPAfMoW


The Intelligence Advanced Research Projects Activity (IARPA) invests in high-risk, high-payoff research programs to tackle some of the most difficult challenges of the agencies and disciplines in the Intelligence Community (IC).