Using learning-to-rank to precisely locate where to deliver packages

Models adapted from information retrieval deal well with noisy GPS input and can leverage map information.

For delivery drivers, finding the doorstep where a package should be dropped off can be surprisingly hard. House numbers can be obscured by foliage, or they might be missing entirely; some neighborhoods use haphazard numbering systems that make house numbers hard to guess; and complexes of multiple buildings sometimes share a single street address.

LTR for GPS.png
In this figure, the dark-blue circles represent the GPS coordinates recorded for deliveries to the same address. The red circle is the actual location of the customer’s doorstep. Taking the average (centroid) value of the measurements yields a location (light-blue circle) in the middle of the street, leaving the driver uncertain and causing delays.

Having the correct latitude-longitude coordinates of the customer’s doorstep would make delivery more efficient, but that information is hard to come by. When a driver confirms a delivery, our app records the current GPS location, which could be anywhere between the customer’s door and the Prime delivery van parked down the road. Plus, in city “canyons”, where line of sight to GPS satellites is severely limited, the GPS measurement error can be substantial.

In a paper I’m presenting at the European Conference on Machine Learning (ECML), I adapt an idea from information retrieval — learning-to-rank — to the problem of predicting the coordinates of a delivery location from past GPS data.

In experiments, I compared the new method to two other approaches to the problem — centroid computation and kernel density estimation (KDE) — and found that the new method significantly outperformed its predecessors. On delivery data from New York state, the error for the new method was far less than that for KDE, the best-performing baseline. 

Learning-to-rank

In the information retrieval context, learning-to-rank is a way to learn from pairwise preference data. If a search engine presents a ranked list of results, clicking only the third search result implicitly provides two pairwise preferences: the user prefers the third search result to the first result, and the user also prefers the third result to the second result. That provides two labeled preference pairs that can help train a ranking model to improve future search results for other queries.

Similarly, I train a ranking model to select the best point from a set of candidate locations for a particular address. A single labeling click on the best location implies a preference order over (nearly) all pairs of candidate locations; the candidate closest to the labeled location is preferred. Thus, the volume of training pairs produced per labeling click is much greater than in information retrieval. The base binary classifier model takes pairs of points as input and is trained to prefer the one closer to the labeled point.

There’s a difference between the popular learning-to-rank methods in information retrieval and my adaptation of them, however. In the search engine scenario, the algorithm may be sorting through tens of thousands of documents or products to produce a ranking. Although the model is trained by pairwise comparisons, at inference time it doesn’t have enough time to compare each candidate document against all the others. Instead, it works as a regression model to score each candidate independently, and the final ranking is simply sorted by score.

In the geospatial case, however, we typically compute offline, and we usually have fewer than 100 sufficiently distinct candidate locations to consider for each delivery address. This makes it practical to compare each candidate location against all the others at inference time: we select the one that wins the most pairwise comparisons. Experiments showed that this produced better results than regression models trained from pairwise data, such as RankNet.

The primary machine learning model I used was a random forest — an ensemble of, say, 30 decision trees learned from the training data. Each decision tree performs a series of evaluations on a select set of data attributes (features) to produce a score. The average of all the trees’ scores is the model’s overall score for a given input item, which indicates a preference for one or the other of two candidate points.

Experimental results

To generate candidate locations, we first thin out the many GPS delivery locations reported in the past: only a point or two will represent a tight cluster of points. Then we add potential candidates along the faces of nearby buildings. 

The feature vector for each candidate includes features based on the density of past GPS measurements in its vicinity, as well as features based on nearby map data. These include things like the distance to the nearest street, the distance to the nearest parking lot, the distance to the nearest building, etc. Among other things, these types of features help the model not to select locations in the middle of a street or parking lot, which a centroid or KDE model might easily do (as in the figure above). There are also context features that help, such as the number of past deliveries and the number of building outlines nearby. 

The ability to leverage a variety of informative features explains the ranking model’s advantage over the baselines. Not all areas have reliable map data, but the many addresses on which my model’s performance is strongest probably do. Since the baseline models can’t exploit that information, the gap in performance is greater.

Technically, the first baseline I consider is a mediod method, which selects the candidate location closest to the centroid of the past GPS measurements. Centroid, mediod, and geometric-median methods all make the faux pas of selecting a low-density point in the middle of a multimodal distribution. 

The KDE method avoids this by finding dense clusters of past GPS points, but often the ground truth doorway is at the edge of the cluster, not in the middle. Thus, we really need a supervised machine learning method that can leverage many different signals, including information from the online map. The method is somewhat resilient to places in which the map is incomplete — for instance, missing building outlines or even roads.

I also compared the new model against an oracle that always selects the candidate location closest to the true location of the doorway; unless you have an omniscient candidate generator or generate all points, you won’t always have a candidate that’s super close to the ground truth label. 

Below are plots showing the performance of the novel ranking model (GeoRank) against the mediod and KDE methods, the oracle, and, as an upper bound, random selection. They are shown for two different sets of data, one for New York state and one for Washington state. The y-axis is loss, so lower is better; the x-axis reveals the whole distribution of losses (instead of a single scalar average per method).

Geospatial-coordinate loss.png

As can be seen, the GeoRank model significantly outperforms the baselines and compares favorably against the oracle. By putting this work into practice, we have improved delivery efficiency substantially, with benefits for both Amazon customers and delivery drivers for Amazon Last Mile. Please read the paper for more detail.

Research areas

Related content

IL, Tel Aviv
Come join the AWS Agentic AI science team in building the next generation models for intelligent automation. AWS, the world-leading provider of cloud services, has fostered the creation and growth of countless new businesses, and is a positive force for good. Our customers bring problems that will give Applied Scientists like you endless opportunities to see your research have a positive and immediate impact in the world. You will have the opportunity to partner with technology and business teams to solve real-world problems, have access to virtually endless data and computational resources, and to world-class engineers and developers that can help bring your ideas into the world. As part of the team, we expect that you will develop innovative solutions to hard problems, and publish your findings at peer reviewed conferences and workshops. We are looking for world class researchers with experience in one or more of the following areas - autonomous agents, API orchestration, Planning, large multimodal models (especially vision-language models), reinforcement learning (RL) and sequential decision making.
IL, Tel Aviv
Are you a Masters or PhD student interested in a 2026 Internship in Data Science? If so, we want to hear from you! We are looking for a customer obsessed Data Scientist Intern who can innovate in a business environment and is comfortable owning data to drive step-change innovation in the EMEA region or worldwide. If this describes you, come and join our Data Science teams at Amazon for an exciting internship opportunity. If you are insatiably curious and always want to learn more, then you’ve come to the right place. You can find more information about the Amazon Science community as well as our interview process via the links below; https://www.amazon.science/ https://amazon.jobs/content/en/career-programs/university/science Key job responsibilities As a Data Science Intern, you will have the following key job responsibilities: • Work closely with scientists and engineers to develop new algorithms to implement scientific solutions for Amazon problems • Design, run, and analyze A/B tests • Work on an interdisciplinary team on customer-obsessed research • Experience Amazon's customer-focused culture • Create and deliver projects that can be quickly applied starting locally and scaled to EMEA/worldwide • Create and share data with audiences of varying levels technical papers and presentations • Define metrics and design algorithms to estimate customer satisfaction and engagement A day in the life At Amazon, you will grow into the high impact person you know you’re ready to be. Every day will be filled with developing new skills and achieving personal growth. How often can you say that your work changes the world? At Amazon, you’ll say it often. Join us and define tomorrow. Some more benefits of an Amazon Science internship include; • All of our internships offer a competitive stipend/salary • Interns are paired with an experienced manager and mentor(s) • Interns receive invitations to different events such as intern program initiatives or site events • Interns can build their professional and personal network with other Amazon Scientists • Interns can potentially publish work at top tier conferences each year About the team Applicants will be reviewed on a rolling basis and are assigned to teams aligned with their research interests and experience prior to interviews. Start dates are available throughout the year and durations can vary in length from 3-6 months for full time internships or 6-12 months for part time internships. Please note these are not remote internships.
IN, KA, Bengaluru
Alexa+ is the world’s best Generative AI powered personal assistant / agent for consumers. We are seeking an Applied Scientist to join our newly expanding team in India focused on Alexa Conversational Ads and Personalization. In this role, you will build machine learning models that seamlessly and naturally integrate relevant advertising into the Alexa experience while deeply personalizing user interactions. You will work closely with other scientists, engineers, and product managers to take models from conception to production. Key job responsibilities Design, develop, and evaluate innovative deep learning and GenAI models for natural language processing (NLP), recommendation systems, and personalization. Conduct hands-on data analysis and build scalable ML pipelines. Design and run A/B experiments to measure the impact of new models on customer experience and ad performance. Collaborate with software development engineers to deploy models into high-scale, real-time production environments. About the team We are building a new science team in Bangalore to solve some of the most impactful problems in computational advertising. This isn't about tweaking existing models as we are rethinking how ads are ranked, priced, and personalized across voice-first and screen-first surfaces. These are problems that don't have textbook solutions. Key points to note about the team: 🧪 Greenfield team - you are not joining a mature org with rigid processes. You will shape the science roadmap, pick the problems, and define the culture from day one. 📈 Direct business impact — your models directly drive revenue. No yearly cycles to see if your work matters. 🌏 Global scope, local autonomy — collaborate with scientists and engineers across Seattle, Sunnyvale, and Bangalore, but own your problem space end-to-end. 🎓 Ship AND Publish: We encourage top-tier publications (NeurIPS, ACL, EMNLP, KDD, ICML, WWW) while ensuring your research hits production.
IN, KA, Bengaluru
Alexa+ is the world’s best Generative AI powered personal assistant / agent for consumers. We are seeking an Applied Scientist to join our newly expanding team in India focused on Alexa Conversational Ads and Personalization. In this role, you will build machine learning models that seamlessly and naturally integrate relevant advertising into the Alexa experience while deeply personalizing user interactions. You will work closely with other scientists, engineers, and product managers to take models from conception to production. Key job responsibilities Design, develop, and evaluate innovative deep learning and GenAI models for natural language processing (NLP), recommendation systems, and personalization. Conduct hands-on data analysis and build scalable ML pipelines. Design and run A/B experiments to measure the impact of new models on customer experience and ad performance. Collaborate with software development engineers to deploy models into high-scale, real-time production environments. About the team We are building a new science team in Bangalore to solve some of the most impactful problems in computational advertising. This isn't about tweaking existing models as we are rethinking how ads are ranked, priced, and personalized across voice-first and screen-first surfaces. These are problems that don't have textbook solutions. Key points to note about the team: 🧪 Greenfield team - you are not joining a mature org with rigid processes. You will shape the science roadmap, pick the problems, and define the culture from day one. 📈 Direct business impact — your models directly drive revenue. No yearly cycles to see if your work matters. 🌏 Global scope, local autonomy — collaborate with scientists and engineers across Seattle, Sunnyvale, and Bangalore, but own your problem space end-to-end. 🎓 Ship AND Publish: We encourage top-tier publications (NeurIPS, ACL, EMNLP, KDD, ICML, WWW) while ensuring your research hits production.
IN, KA, Bengaluru
Alexa+ is the world’s best Generative AI powered personal assistant / agent for consumers. We are seeking an Applied Scientist to join our newly expanding team in India focused on Alexa Conversational Ads and Personalization. In this role, you will build machine learning models that seamlessly and naturally integrate relevant advertising into the Alexa experience while deeply personalizing user interactions. You will work closely with other scientists, engineers, and product managers to take models from conception to production. Key job responsibilities Design, develop, and evaluate innovative deep learning and GenAI models for natural language processing (NLP), recommendation systems, and personalization. Conduct hands-on data analysis and build scalable ML pipelines. Design and run A/B experiments to measure the impact of new models on customer experience and ad performance. Collaborate with software development engineers to deploy models into high-scale, real-time production environments. About the team We are building a new science team in Bangalore to solve some of the most impactful problems in computational advertising. This isn't about tweaking existing models as we are rethinking how ads are ranked, priced, and personalized across voice-first and screen-first surfaces. These are problems that don't have textbook solutions. Key points to note about the team: 🧪 Greenfield team - you are not joining a mature org with rigid processes. You will shape the science roadmap, pick the problems, and define the culture from day one. 📈 Direct business impact — your models directly drive revenue. No yearly cycles to see if your work matters. 🌏 Global scope, local autonomy — collaborate with scientists and engineers across Seattle, Sunnyvale, and Bangalore, but own your problem space end-to-end. 🎓 Ship AND Publish: We encourage top-tier publications (NeurIPS, ACL, EMNLP, KDD, ICML, WWW) while ensuring your research hits production.
IN, KA, Bengaluru
Alexa+ is the world’s best Generative AI powered personal assistant / agent for consumers. We are seeking an Applied Scientist to join our newly expanding team in India focused on Alexa Conversational Ads and Personalization. In this role, you will build machine learning models that seamlessly and naturally integrate relevant advertising into the Alexa experience while deeply personalizing user interactions. You will work closely with other scientists, engineers, and product managers to take models from conception to production. Key job responsibilities - Design, develop, and evaluate innovative machine learning and deep learning models for natural language processing (NLP), recommendation systems, and personalization. - Conduct hands-on data analysis and build scalable ML pipelines. - Design and run A/B experiments to measure the impact of new models on customer experience and ad performance. - Collaborate with software development engineers to deploy models into high-scale, real-time production environments.
US, CA, San Francisco
Join Amazon's Frontier AI & Robotics team as a Member of Technical Staff, this Technical Program Manager will become the driving force behind breakthrough robotics innovation. You'll orchestrate complex, cross-functional programs that bridge AI research, software, hardware, and production deployment—managing the technical workstreams that enable robots to see, reason, and act in Amazon's warehouse environments. Your program leadership will directly accelerate our mission to build the next generation of embodied intelligence. Key job responsibilities · Establish and drive program management mechanisms and cadence for complex robotics and AI development initiatives spanning research, software engineering, hardware, and operations · Manage end-to-end program execution across the full robotics stack—including AI models, software engineering, and hardware deployment · Drive decision-making velocity by facilitating tradeoff discussions when there are conflicting priorities; determine whether decisions are one-way or two-way doors · Own program-level risk management, proactively identifying technical, schedule, and resource risks; escalate where necessary and drive mitigation strategies · Manage dependencies and scope changes across internal teams and partner organizations, ensuring alignment on commitments, timelines, and technical requirements · Create transparency through clear RACI frameworks, program dashboards, and communication mechanisms that keep stakeholders aligned on status, risks, and decisions · Exercise strong technical judgment to influence program-level decisions on deployment methodology, scalability requirements, and technical feasibility—acting as the voice back to research and engineering teams · Build sustainable program management processes that scale as our organization grows, adapting agile frameworks to the unique challenges of AI robotics A day in the life Your focus centers on driving velocity and alignment across our robotics programs. You might start your morning facilitating tradeoff decisions between AI researchers and software engineers on a critical prototype milestone, then transition to managing dependencies across hardware and operations teams to keep timelines on track. In the afternoon, you could be conducting risk assessments on supply chain constraints that impact our development roadmap, updating program dashboards to provide leadership visibility, or working with partner teams to align on deployment strategies. You'll establish the mechanisms and cadence that keep our fast-moving organization synchronized—from sprint planning rituals to cross-functional design reviews. Throughout the day, you balance hands-on program execution with strategic escalation, ensuring technical decisions align with our long-term vision while removing obstacles that slow teams down. You're the connective tissue that enables researchers, engineers, and operations specialists to move fast together. About the team At Frontier AI & Robotics, we're not just advancing robotics – we're reimagining it from the ground up. Our team is building the future of intelligent robotics through frontier foundation models and end-to-end learned systems. We tackle some of the most challenging problems in AI and robotics, from developing sophisticated perception systems to creating adaptive manipulation strategies that work in complex, real-world scenarios. What sets us apart is our unique combination of ambitious research vision and practical impact. We leverage Amazon's computational infrastructure and rich real-world datasets to train and deploy state-of-the-art foundation models. Our work spans the full spectrum of robotics intelligence – from multimodal perception using images, videos, and sensor data, to sophisticated manipulation strategies that can handle diverse real-world scenarios. We're building systems that don't just work in the lab, but scale to meet the demands of Amazon's global operations. Join us if you're excited about pushing the boundaries of what's possible in robotics, working with world-class researchers, and seeing your innovations deployed at unprecedented scale.
US, CA, San Francisco
We are seeking a hands-on Electrical Engineer to lead the design and integration of electrical systems or subsystems for high-degree-of-freedom robotic platforms. This role involves architecting the robot’s power distribution, sensor wiring, and embedded electrical infrastructure. You will be responsible for designing across the full electrical system for advanced robotics platforms including power distribution, sensing, compute, motor controllers, communication infrastructure, battery system and power electronics in close collaboration with mechanical, controls and software engineers. You’ll play a key role in ensuring high-performance, reliable operation of complex electromechanical systems under real-world conditions. Key job responsibilities * Electrical system architect / owner for power electronics, actuation, PCBAs, battery, ware harness specs and high speed electrical/communications protocols * Design, develop and integrate power distribution, embedded electronics, motor controllers and safety-critical circuits for complex robotic systems * Own board layout of PCBAs including SoCs, microcontrollers, sensors, power devices, etc. using Cadence OrCAD/Allegro or equivalent tools. Oversee bring-up and validation * Determine appropriate high speed electrical and communication protocols (e.g., CAN, EtherCAT, USB, etc) for reliable and efficient system operation * Specify and design custom power electronics and power distribution boards to meet performance, thermal, and safety requirements * Design and route all cabling and wire harnesses across the robotic platform, considering EMI, signal integrity, serviceability, and integration with mechanical structures * Architect and integrate the robot’s battery system, including protection circuitry, battery management, charging systems, and thermal considerations * Define and implement wiring and electrical interfaces for sensors (e.g., lidar, stereo cameras, IMUs, tactile) and compute modules * Ownership over prototyping and bringing up electrical designs and creation of test & validation rigs About the team At Frontier AI & Robotics, we're not just advancing robotics – we're reimagining it from the ground up. Our team is building the future of intelligent robotics through innovative foundation models and end-to-end learned systems. We tackle some of the most challenging problems in AI and robotics, from developing sophisticated perception systems to creating adaptive manipulation strategies that work in complex, real-world scenarios. What sets us apart is our unique combination of ambitious research vision and practical impact. We leverage Amazon's massive computational infrastructure and rich real-world datasets to train and deploy state-of-the-art foundation models. Our work spans the full spectrum of robotics intelligence – from multimodal perception using images, videos, and sensor data, to sophisticated manipulation strategies that can handle diverse real-world scenarios. We're building systems that don't just work in the lab, but scale to meet the demands of Amazon's global operations. Join us if you're excited about pushing the boundaries of what's possible in robotics, working with world-class researchers, and seeing your innovations deployed at unprecedented scale.
US, NY, New York
We are seeking an Applied Scientist to develop and optimize Visual Inertial Odometry (VIO) and sensor fusion systems for our intelligent robots. In this role, you will design, implement, and deploy state estimation and tracking algorithms that enable robots to understand their position and motion in real time, even in challenging and dynamic environments. You will own the full pipeline from algorithm development through embedded deployment, ensuring that perception systems run efficiently on resource-constrained robotic hardware. You will also leverage modern machine learning approaches to push the boundaries of classical perception methods, combining learned representations with geometric techniques to achieve robust, real-time performance. This is a deeply hands-on role. You will work directly with sensors, hardware, and real-world data, while prototyping, testing, and iterating in physical environments. The ideal candidate has strong foundations in VIO and sensor fusion, practical experience optimizing algorithms for embedded platforms, and familiarity with how modern deep learning is transforming perception. Key job responsibilities - Design and implement Visual Inertial Odometry algorithms for robust real-time state estimation on robotic platforms like Sprout - Develop multi-sensor fusion pipelines integrating cameras, IMUs, and other sensing modalities for accurate pose tracking - Optimize perception and tracking algorithms for deployment on embedded hardware (e.g., ARM, GPU-accelerated edge devices) under strict latency and power constraints - Apply modern ML-based perception techniques (learned features, depth estimation, neural odometry) to complement and improve classical geometric approaches - Build and maintain calibration, evaluation, and benchmarking infrastructure for perception systems - Collaborate with hardware, controls, and navigation teams to integrate perception outputs into the robot’s autonomy stack - Lead technical projects from research prototyping through production deployment
US, WA, Bellevue
The candidate in this role will own delivery of science products and solutions to help Amazon Devices Sales and Marketing org. make better decisions: product recommendations to customers, segmentation, financial incrementality of marketing initiatives, A/B testing etc. Key job responsibilities The Amazon Devices organization designs, produces and markets Echo Speakers, Kindle e-readers, Fire Tablets, Fire TV Streaming Media Players, Ring and Blink Smart Home & Security products. We are constantly looking to innovate on behalf of customers with new devices in existing or new categories or improving customer experience on existing platforms. The Devices Data Services (DDS) team provides Data Science, Analytics and Engineering support to the broader organization to enable Sales and Marketing activities across all these product lines. We are looking for an innovative, hands-on and customer-obsessed Data Scientist who can be a strategic partner to the product managers and engineers on the team. Our projects span multiple organizations and require coordination of experimentation, economic and causal analysis, and building predictive machine learning models. A successful candidate will be a problem solver who enjoys diving into data, is excited by difficult modeling challenges, is motivated to build something that will eventually become a production software system, and possesses strong communication skills to effectively interface between technical and business teams. In this role, you will be a technical expert with massive impact. You will take the lead on developing advanced ML systems that are key to reaching our customers with the right recommendations at the right time. Your work will directly impact the success of Amazon's growing Devices business. You will work across diverse science/engineering/business teams. You will work on critical data science problems, building high quality, reliable, accurate, and consistent code sets that are aligned with our business needs. Key Performance Areas - Implement statistical or machine learning methods to solve specific business problems. - Improve upon existing methodologies by developing new data sources, testing model enhancements, and fine-tuning model parameters. - Directly contribute to development of modern automated recommendation systems - Build customer-facing reporting tools to provide insights and metrics to track model performance and explain variance - Collaborate with researchers, software developers, and business leaders to define product requirements, provide analytical support, and communicate feedback A day in the life You will work with other scientists, engineers, product managers, and marketers to develop new products that benefit our customers and help us reach our business goals. You will own solutions from end to end: conceptualization, prioritization, development, delivery, and productionalization. About the team We are a full stack science team that empowers product, marketing, and other business leaders to better understand customers who use Amazon devices, make decisions on product development or optimization, and measure the effectiveness of their efforts against our customer’s expectation. Our focus area is to build analytical frameworks that help the organization either access data, better understand the decisions customers are making and why, or assess customer satisfaction.