Case Study: Predictive analysis of SAMRoute around schools

This study evaluates the pre-diagnostic capability of the SAMRoute road risk score around 7,500 schools. It demonstrates how SAMRoute can effectively identify high-risk points of interest.

#intro

SAMRoute Pre-Diagnosis of Road Safety Around 7,500 Schools in France

Analysis of SAMRoute's Ability to Pre-Diagnose the Relative Danger of Road Surroundings of Points of Interest

SAMRoute is an explainable generative AI solution that provides pre-diagnostic safety assessments and risk modeling, such as road accident risk, for transport infrastructure.

In this case study, we will use SAMRoute's pre-diagnostic capabilities to analyze the road surroundings of 7,500 schools located in Western France, covering a population of 10 million inhabitants.

This study explores three questions related to the use of SAMRoute:

1️⃣ Can we evaluate such a large number of points of interest?

2️⃣ How accurate is our pre-diagnostic assessment?

3️⃣ And what does a high-risk site identified by SAMRoute look like?
Video: Practical case and analysis of road surroundings for 7,500 schools in France
#solution

SAMRoute: SaaS solution for proactive road safety

An explainable model for assessing the risk of accidents before they occur

SAMRoute is a SaaS solution that helps road infrastructure stakeholders analyze and model the relative safety of roads before accidents occur, relying on multiple recognized data sources and explainable models.
In a few key points, SAMRoute offers:
  • A scoring system that evaluates the annual risk of accidents (incidence rate) every 100 meters, based on a multi-factorial model (34 factors) tailored to the specific risk profile of different users (cars, motorcycles, bicycles, pedestrians).
  • Linear scoring which, unlike complex and opaque models from deep learning (AI), produces decomposable, transparent, and explainable predictions.
  • 72.5 million kilometers of roads scored from OpenStreetMap.
  • Accident history for retrospective analysis (FR, NL, UK, US).
  • Satellite data layers from NASA and ESA to provide context on the environment near roads.
  • Transport routes and stops in France (GTFS) to identify intermodal hubs.
  • Street-level imagery with annotations from META/Mapillary (FR, US/CA+NY).
Figure. Data sources used
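To illustrate what "decomposable" means for a linear score, here is a minimal sketch. The factor names, weights, and values are invented for the example and are not SAMRoute's 34 factors; the point is only that a linear score is a sum of per-factor contributions that can be read off one by one.

```python
# Hypothetical factors and weights, for illustration only.
# The real SAMRoute model uses 34 factors that are not listed in this study.
WEIGHTS = {
    "speed_limit_kmh": 0.004,
    "intersection_count": 0.05,
    "lighting_absent": 0.1,
}


def score_segment(factors, weights=WEIGHTS, intercept=0.0):
    """Return (score, contributions) for one 100 m road segment.

    The score is linear: intercept + sum(weight * factor value), so each
    factor's contribution can be reported separately.
    """
    contributions = {
        name: weights[name] * factors.get(name, 0.0) for name in weights
    }
    score = intercept + sum(contributions.values())
    return score, contributions


# One made-up segment.
segment = {"speed_limit_kmh": 50, "intersection_count": 2, "lighting_absent": 1}
score, parts = score_segment(segment)

# Print contributions from largest to smallest, then the total.
for name, value in sorted(parts.items(), key=lambda kv: -kv[1]):
    print(f"{name:>20}: {value:.3f}")
print(f"{'total':>20}: {score:.3f}")
```

Because the total is just the sum of the listed contributions, a reviewer can see which factor drives the score of a given segment, which is the property the bullet on linear scoring refers to.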
#method

Evaluation of the pre-diagnostic performance of SAMRoute

Road risk score near 7,500 schools in Western France

We evaluate the ability of the SAMRoute risk score to identify schools with dangerous surrounding roads. The protocol is as follows:

1️⃣ For each school, within a 500-meter radius, we calculate the average risk score and check for any recent fatalities (2019-2022).

2️⃣ We run a ROC (Receiver Operating Characteristic) analysis fourteen times, once per department, each time sweeping a binary classification threshold across the range of risk scores. Each analysis produces a ROC graph and an AUC (area under the curve) metric.

3️⃣ Next, we set the classification threshold at the 90th percentile of scores: a school whose score falls in the top 10% is pre-diagnosed as at-risk.

4️⃣ We calculate various metrics, including Spearman’s correlation and the positive predictive value (PPV), which is the proportion of correct predictions in the at-risk category.
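The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the study: the data is invented, the positive label is taken to be "at least one recent fatality within the radius", and the 90th percentile is computed with the standard library's default quantile convention, which may differ from the one used in the study.

```python
from statistics import quantiles


def roc_auc(scores, labels):
    """AUC as the probability that a random positive school scores higher
    than a random negative one (ties count for one half)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def top_decile_ppv(scores, labels):
    """Flag schools above the 90th percentile and return (threshold, PPV)."""
    threshold = quantiles(scores, n=10)[-1]  # last of the 9 decile cut points
    flagged = [y for s, y in zip(scores, labels) if s > threshold]
    ppv = sum(flagged) / len(flagged) if flagged else float("nan")
    return threshold, ppv


# Invented toy data: 20 schools with an average risk score and a 0/1 label
# (1 = at least one recent fatality within the radius).
scores = [float(s) for s in range(1, 21)]
labels = [0] * 14 + [1, 0, 1, 0, 1, 1]

auc = roc_auc(scores, labels)
threshold, ppv = top_decile_ppv(scores, labels)
print(f"AUC={auc:.3f}  threshold={threshold:.1f}  PPV={ppv:.2f}")
```

The pairwise AUC is quadratic in the number of schools, which is fine for a per-department analysis of a few hundred sites; a rank-based formula would be preferable for much larger sets.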


The results are represented as follows:
  • fourteen graphs (qq-plot, ROC, confusion matrix) and
  • a table (PPV).
Figure. Protocol to evaluate performance
#results

SAMRoute identifies high-risk zones

Positive correlation between the scoring and the number of accidents, and predictive performance of the score

The table presents one row per analysis. Each row includes the region, department, the number of schools, the number of fatalities, as well as three metrics: prevalence, PPV (positive predictive value), and LR+ (positive likelihood ratio). Sorting by the PPV column shows that 8 departments have a PPV above 90%, while the other 6 fall between 66.1% and 89.1%.

➡️ SAMRoute thus succeeds in pre-diagnosing schools with dangerous surrounding roads, with a precision that varies by department.
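The three metrics in the table are linked: for a given threshold, PPV follows from prevalence and LR+ through the usual odds form of Bayes' rule (post-test odds = pre-test odds × LR+). The short sketch below shows that relationship; the numbers are invented and are not taken from the table.

```python
def ppv_from_lr(prevalence, lr_plus):
    """PPV implied by a prevalence (between 0 and 1, exclusive) and a
    positive likelihood ratio, using the odds form of Bayes' rule."""
    pre_odds = prevalence / (1 - prevalence)
    post_odds = pre_odds * lr_plus
    return post_odds / (1 + post_odds)


# Invented example: 20% prevalence and LR+ = 16.
print(f"PPV = {ppv_from_lr(0.20, 16):.2f}")
# The same LR+ with a lower prevalence gives a lower PPV.
print(f"PPV = {ppv_from_lr(0.05, 16):.2f}")
```

This is why PPV and LR+ are best read together: PPV depends on how common events are in a department, while LR+ describes the score's discriminative power independently of prevalence.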

The graphs display the analyses by department. They show a positive correlation between the number of accidents and SAMRoute scores, with Spearman coefficients (ρ) ranging from 0.52 to 0.80, all with p < 0.001.
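For readers unfamiliar with Spearman's ρ, it is the Pearson correlation computed on ranks rather than raw values, with tied values sharing their average rank. The sketch below is a plain-Python illustration on invented data; it does not compute a p-value and is not the study's code.

```python
def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the group of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the ranks of x and y.

    Assumes neither input is constant (otherwise the variance is zero).
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Invented data: accident counts and average scores for six schools.
accidents = [0, 0, 1, 2, 2, 5]
avg_score = [0.10, 0.30, 0.20, 0.40, 0.60, 0.90]
print(f"rho = {spearman_rho(accidents, avg_score):.2f}")
```

Accident counts contain many ties (several schools with zero or one event), which is why the average-rank treatment matters here.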
Graph components:
1️⃣ A QQ-plot that reveals the relationship between the number of accidents (x-axis) and the average school score (y-axis), with each point representing a school.

2️⃣ A graph showing LR+ across different thresholds.

3️⃣ A PR curve displaying the relationship between precision and recall.

4️⃣ A ROC curve illustrating the trade-off between the true positive rate (sensitivity) and the false positive rate at different thresholds.

5️⃣ A summary of analysis parameters includes the radius, road users factored into the risk score, selected threshold (for the confusion matrix and PPV), observed period, severity levels considered, road users involved in accidents, and event type.

6️⃣ A section reporting contextual estimates: the number of POIs, total area (km²), area with an event (km²), number of accidents, and prevalence (%).

7️⃣ A confusion matrix including AP (actual positive), AN (actual negative), PP (predicted positive), PN (predicted negative), sensitivity, specificity, PPV (positive predictive value), and NPV (negative predictive value).

8️⃣ A summary table presenting metrics such as: TPR (true positive rate), TNR (true negative rate), FPR (false positive rate), FNR (false negative rate), LR+ (positive likelihood ratio), DOR (diagnostic odds ratio), ACC (accuracy), Spearman correlation, and Type I and Type II errors.
Figure. Results of the risk analysis around schools by department, 2019-2022
  • Calvados (14), Normandy, France
  • Sarthe (72), Pays-de-la-Loire, France
  • Orne (61), Normandy, France
  • Seine-Maritime (76), Normandy, France
  • Vendée (85), Pays-de-la-Loire, France
  • Morbihan (56), Brittany, France
  • Mayenne (53), Pays-de-la-Loire, France
  • Manche (50), Normandy, France
  • Maine-et-Loire (49), Pays-de-la-Loire, France
  • Loire-Atlantique (44), Pays-de-la-Loire, France
  • Ille-et-Vilaine (35), Brittany, France
  • Finistère (29), Brittany, France
  • Eure (27), Normandy, France
  • Côtes-d'Armor (22), Brittany, France
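All the metrics listed in the confusion matrix and summary table derive from four counts: true positives, false positives, false negatives, and true negatives. The sketch below uses the standard definitions; the counts are invented for illustration and do not correspond to any department in the study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic metrics from the four confusion-matrix counts.

    Assumes each row and column of the matrix is non-empty.
    """
    total = tp + fp + fn + tn
    tpr = tp / (tp + fn)  # sensitivity
    tnr = tn / (tn + fp)  # specificity
    fpr = fp / (fp + tn)
    fnr = fn / (fn + tp)
    return {
        "prevalence": (tp + fn) / total,
        "TPR": tpr,
        "TNR": tnr,
        "FPR": fpr,  # Type I error rate
        "FNR": fnr,  # Type II error rate
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
        "LR+": tpr / fpr if fpr else float("inf"),
        "DOR": (tp * tn) / (fp * fn) if fp and fn else float("inf"),
        "ACC": (tp + tn) / total,
    }


# Invented counts for 100 schools, 10 of them flagged as at-risk.
metrics = diagnostic_metrics(tp=8, fp=2, fn=12, tn=78)
for name, value in metrics.items():
    print(f"{name:>10}: {value:.3f}")
```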
#take-aways

Key takeaways of the case study

SAMRoute delivers significant value to road infrastructure managers

Discriminative power
SAMRoute’s scoring model shows a positive likelihood ratio (LR+) ranging from 1.45 to 23.49 across the analyses. Every value is above 1, so a high score always raises the odds that a site is high-risk, although the strength of that signal varies widely.
Identify risk zones
SAMRoute effectively identifies high-risk areas around schools, with a positive predictive value (PPV) above 90% in 8 of the 14 departments analyzed.
Evaluate proactively
Using predictive risk scoring, SAMRoute enables stakeholders to act before accidents happen, minimizing both human and economic costs.
Cost Efficiency in Resource Allocation
By focusing on the 10% of sites with the highest risk scores, SAMRoute helps direct safety-improvement resources to where they are most needed.
Data-driven decision making
SAMRoute leverages reliable data sources (NASA, ESA, OpenStreetMap, etc.) to provide transparent, explainable models, ensuring decisions are based on concrete data.
More trust and accountability
SAMRoute’s transparent predictions help infrastructure managers better communicate with the public, building trust in safety measures and supporting transparent decision-making.
#get-started

The reasons to select SAMRoute

Why could SAMRoute be an innovative solution to improve the safety of your organization's transport infrastructure?

Because:
  • SAMRoute evaluates 100% of the road network, including often overlooked segments.
  • You are not limited by administrative boundaries.
  • You stay ahead of new regulations, such as Directive (EU) 2019/1936, which mandates proactive road assessment for major roads by 2024.
  • By identifying high-risk segments early, you have the chance to avoid human and material costs resulting from accidents.
  • The risk score from SAMRoute is linear and explainable, unlike deep-learning models, which are non-linear and opaque.