
Data Training


4 Major Regulatory Hurdles in the Autonomous Driving Space

By Abhilash Malluru
March 13, 2023

Autonomous driving as a field is booming. As more automotive manufacturers integrate autonomous technologies into their vehicles, fully autonomous cars are only a stone's throw away.

Regulations for autonomous driving typically focus on two key areas: safety and performance. This article concentrates on the regulatory and legislative hurdles surrounding the safety of automated driving and autonomous vehicles.

1. Liability and Autonomous Vehicles

No means of transportation is without its hiccups, and unfortunately, autonomous driving has seen numerous fatal accidents, with eleven recorded in 2022 alone. Currently, all manufacturers of autonomous vehicles are required to report accidents to the National Highway Traffic Safety Administration (NHTSA).

The points of failure in an autonomous vehicle are more nebulous, and concerns have surfaced about who is liable in an accident. Since today's cars are not fully autonomous, an accident could stem from driver carelessness, software malfunctions, or mechanical failures. As the technology improves and cars become more autonomous, accident liability will shift toward the manufacturers and developers. There is no clear-cut solution yet, as the issue has yet to mature.

A Problem of Interwoven Pieces

Autonomous vehicles are complex, with a great deal of interconnectivity among the pieces that power and control them. Some speculate that as liability shifts to developers and manufacturers, each incident will pose severe hurdles to untangle.

Those making the AVs must analyze every component of the vehicle and perhaps even divulge the proprietary software suites that power the car while assisting law enforcement.

2. Federal and State Regulations

The first road safety initiatives began years before computer chips ever graced automobiles. Much has changed in automotive technology since, but the regulatory bodies are slower to catch up. Currently, there isn’t a wide-sweeping federal regulation governing fully autonomous vehicles.

The NHTSA has made some provisions regarding autonomous vehicles and specific safety feature requirements. This is a positive sign since the safety features that auto manufacturers must include are congruent with autonomous vehicle technologies.

State Laws

Forty-three of the 50 states have legislation regarding automated vehicles. Some laws are restrictive, while others depend on each vehicle's SAE automation level. Liability insurance factors into most of these laws, since every state save New Hampshire and Virginia requires it.

The other seven states haven't enacted laws regarding autonomous vehicles, and there is no indication of when legislation might be drafted. Several states also require a licensed safety driver behind the wheel, adding another logistical burden to larger fleet deployments.

Federal Laws

The only federal-level agency providing some oversight over autonomous driving is the previously mentioned NHTSA. Federal regulation currently stipulates safety features, not the deployment of large commercial autonomous vehicle fleets. This isn’t necessarily bad, but a lack of an overarching baseline may cause future headaches for manufacturers.

Limited federal regulations also mean manufacturers must consider various state laws when developing and deploying autonomous vehicles.

3. Cybersecurity of Autonomous Vehicles

Tech magnates worldwide have bolstered their cybersecurity after hard-learned lessons: cyber attacks, extreme platform compromises, and significant money lost to offline systems. Yet the nascent autonomous driving space hasn't fully accounted for its lack of protected systems. If a server goes down or a vehicle is compromised, the cost would be not only money and time but potentially lives.

Despite the technological marvels surrounding AVs, there isn’t much cybersecurity support. These vehicles have diverse means of connectivity, leaving many open attack vectors. For example, the Internet of Things (IoT) has long been a highly vulnerable method of communication. Many AVs communicate with smart devices in the home, and security measures haven’t yet been fully developed to address potential attacks.

Much could be done to bolster and harden the systems around autonomous vehicles. Encrypted digital transmission has been present in IoT for quite some time, and hardened entry points requiring user authentication could mitigate attacks and deter bad actors.
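As a concrete illustration of hardening an entry point, the sketch below signs vehicle telemetry with an HMAC so a backend can reject tampered messages. It is a minimal, hypothetical example using only Python's standard library; a real deployment would add encryption on top and keep the key in a hardware security module rather than in memory.

```python
import hmac
import hashlib
import secrets

# Shared secret provisioned to the vehicle and the backend out of band.
# (Illustrative only; production systems use a hardware security module.)
SECRET_KEY = secrets.token_bytes(32)

def sign_message(payload: bytes, key: bytes) -> bytes:
    """Attach an HMAC-SHA256 tag so the receiver can verify integrity."""
    return hmac.new(key, payload, hashlib.sha256).digest()

def verify_message(payload: bytes, tag: bytes, key: bytes) -> bool:
    """Constant-time comparison guards against timing side channels."""
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)

telemetry = b'{"speed_kph": 42, "lane": 2}'
tag = sign_message(telemetry, SECRET_KEY)
assert verify_message(telemetry, tag, SECRET_KEY)        # authentic message
assert not verify_message(b"tampered", tag, SECRET_KEY)  # rejected
```

Authentication alone doesn't hide the data, but it does ensure a compromised network link can't silently inject commands or forged telemetry.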

A clear and effective incident response to a systems breach is now a necessity, as it provides a blueprint for how to respond to a compromised vehicle.

4. Data Usage and Privacy Concerns

Along with the lack of security, there is the question of what data auto manufacturers collect and how they use it.

We can expect manufacturers to collect performance metrics, but gathering personalized data presents grave privacy concerns. Regulatory bodies have already addressed the data collected in the medical, financial, and educational sectors. So perhaps it’s a matter of time before additional regulations develop regarding manufacturers’ collection and safeguarding of personal data.

Other concerns arise regarding what the companies do with the data collected from their autonomous vehicles. Location data gives a glimpse at the patterns and lifestyle of the operator of any autonomous vehicle, and it would be a simple step to leverage that data into marketing materials and betray the trust of a potential customer.

Current American legislation regarding data collection could be adapted to provide some degree of security for user data. New legislation and regulations could further shape how manufacturers use the data gathered by AVs.

How to Position Your Enterprise at the Forefront of AD Policies

With all these concerns, how would you move forward?

Here are some steps that you can take to move forward and position your enterprise at the forefront of these policies and regulations.

  1. Liability: Your organization can handle self-reporting, which helps maintain paper trails for all incidents and prepares your staff to respond appropriately to any incidents.

  2. Federal and State Regulations: Maintaining liaisons with regulatory bodies nationwide could benefit your enterprise. It’s also best to adhere to good practices and industry-standard software stacks when approaching the development of these platforms.

  3. Cybersecurity: Cybersecurity has many glaring issues, but you could strengthen your organization by adopting some of the principles AI and ML companies use.

  4. Data: Software stacks could and should adhere to ISO standards for intelligent transport systems, like ISO 22737:2021. Data usage should be self-regulated for now, since formal provisions for best practices in protecting customer data don't yet exist.


Are you looking to integrate standard software solutions for your autonomous driving firm? Digital Divide Data provides data annotation services with SOC 2 Type 2 and ISO 27001 certification.


Determining The New Gold Standard of Autonomous Driving


By Abhilash Malluru
Feb 27, 2023

Autonomous driving is on the cusp of widespread adoption. As more manufacturers across the globe begin implementing AD systems in their vehicles, it is only a matter of time before it becomes a regular feature in future automobiles. And with the rise in popularity of AD systems comes a need for standardization.

Emerging standards are beginning to regulate how manufacturers approach navigation, safety, and AD modeling quality. These standards also influence policy creation, technology use, and the general framework for AD systems. Creating standard systems for these AD models will lead to a more uniform approach toward autonomous driving models.

An Overview of the Tech Behind Autonomous Driving

While the idea of autonomous driving dates back centuries to Leonardo da Vinci's self-propelled cart, most of the technology has been developed in the last few decades. After Navlab 5's self-steering vehicle made headlines in the '90s, autonomous driving really took off.

AD in production vehicles started with Tesla's Autopilot, an SAE Level 2 implementation offering parking assistance and automated driver-assistance features. Tesla doesn't provide a fully autonomous platform for its production vehicles, but Autopilot helped gauge interest among the general public.

Other manufacturers are also spearheading their own development of AD vehicles. For example, Volvo’s recent acquisition of Zenseact, a leading software and hardware developer for autonomous driving, shows the company’s commitment to producing a fully autonomous vehicle. Volvo has also started implementing more sophisticated technologies like LiDAR for its AD driving platforms.

LiDAR data and annotation methods like bounding boxes, polygons, and key points have become ubiquitous in the autonomous driving space. These annotations make up the massive labeled data sets used to train the AI models that feed accurate information to the vehicle in real time so it can adapt and adjust to conditions on the road.
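To illustrate what these labels look like in practice, here is a simplified, COCO-inspired annotation record; the field names and values are purely illustrative, not any particular tool's schema.

```python
# A simplified, COCO-inspired annotation record (fields are illustrative).
annotation = {
    "image_id": "frame_000123",
    "category": "pedestrian",
    # Axis-aligned bounding box: [x_min, y_min, width, height] in pixels.
    "bbox": [412, 230, 58, 140],
    # Polygon outline as a flat [x1, y1, x2, y2, ...] list for finer shapes.
    "segmentation": [412, 230, 470, 232, 468, 370, 414, 368],
    # Key points (e.g., head, shoulders) as (x, y, visible) triples.
    "keypoints": [(441, 240, 1), (425, 265, 1), (458, 266, 0)],
}

def bbox_area(bbox):
    """Area of an [x, y, w, h] box; useful for filtering tiny labels."""
    _, _, w, h = bbox
    return w * h

print(bbox_area(annotation["bbox"]))  # 58 * 140 = 8120
```

A single frame in a driving data set may carry dozens of such records, which is why annotation volume grows so quickly.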

It’s extremely time-consuming to develop models, so there are still limitations, like a reliance on the driver to make crucial driving decisions. Still, this progress is leaps and bounds from where the earlier assistive processes were just a few years ago.

State governments in the United States have already convened and passed legislation regarding autonomous vehicles on public roadways. The most noteworthy is California, which has the most comprehensive regulations for autonomous vehicles. No federal legislation yet permits the deployment of fully autonomous vehicles; for now, it operates on a state-by-state basis.

The Standards Fueling AD’s Mass Adoption

Common methods and standards have grown around the autonomous driving industry. Some of these are just general classifications, and others go down to how the vehicles actually function. As the market around AD grows, it only makes sense that there are more robust systems taking hold to define how these vehicles should safely and effectively operate.

SAE and IEEE

SAE and IEEE have already published their own guidelines defining what autonomous vehicles are and how to classify them. IEEE has more exhaustive standards regarding safety on public roadways and connectivity between vehicles. These aren't necessarily driving the actual development behind autonomous driving, but they show that AD has reached fairly wide acceptance among the bodies developing the hardware and software that fuel it.

Simulations

Simulation is a vital method for developing and testing autonomous driving technology, enabling engineers and researchers to create a virtual environment that mirrors real-world conditions without putting people or property at risk. Simulation offers several benefits to developers, including cost-effectiveness, replicability, safety, scalability, and flexibility.

The cost of building and testing a physical vehicle can be high, but simulation can reduce expenses significantly. Simulating various driving scenarios in a virtual environment can help developers identify potential problems and make necessary adjustments without requiring physical testing, saving both time and money.

Simulations are highly replicable, meaning that a particular scenario can be repeated many times to test different algorithms, sensor configurations, or other variables. This enables developers to gather large amounts of data and draw reliable conclusions from their experiments, providing the necessary information to create efficient autonomous driving systems.
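This replicability can be sketched in a few lines: a seeded random number generator makes a toy scenario exactly repeatable across runs. The example is a generic illustration and assumes nothing about any particular simulator's API.

```python
import random

def run_scenario(seed: int, n_steps: int = 5):
    """Toy driving scenario: a seeded RNG makes every run reproducible."""
    rng = random.Random(seed)  # independent generator per run
    positions = []
    x = 0.0
    for _ in range(n_steps):
        x += rng.uniform(0.5, 1.5)  # simulated forward motion with noise
        positions.append(round(x, 3))
    return positions

# The same seed yields the same trajectory, so experiments are replicable.
assert run_scenario(seed=7) == run_scenario(seed=7)
# Different seeds explore different variations of the same scenario.
assert run_scenario(seed=7) != run_scenario(seed=8)
```

Real simulators apply the same principle at much larger scale, seeding everything from sensor noise to pedestrian behavior so a failure can be replayed exactly.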

Simulation offers safety benefits as well. As autonomous driving technology is still in its early stages, testing in the real world can be risky. Simulating scenarios allows developers to test their technology in a safe environment, reducing the risk of accidents or injury.

Scalability is another benefit: simulation can handle large amounts of data, allowing developers to test various algorithms and scenarios at the same time. Flexibility, in turn, enables quick modification of variables and testing of different scenarios, reducing the time it takes to identify and address potential issues.

Vision Performance Standards

Much like the human driver behind the wheel, an autonomous vehicle needs a constant feed of visual data to interpret its environment. Visual performance is a crucial component of autonomous driving, enabling the car to recognize objects on the roadway and react appropriately. A few emerging metrics underpin this capability: Intersection over Union (IoU), Average Precision (AP), and Mean Average Precision (mAP) are standard measures for evaluating object detection performance.

IoU measures how closely a predicted bounding box overlaps the ground-truth box for a detected object. AP builds on IoU, summarizing a detector's precision across recall levels at a given IoU threshold, while mAP averages AP across object classes (and often across IoU thresholds) to produce a single overall score.
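A minimal IoU computation for axis-aligned bounding boxes makes the overlap metric concrete; this is a generic sketch, not tied to any particular detection framework.

```python
def iou(box_a, box_b):
    """Intersection over Union for boxes given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap rectangle (zero width/height if the boxes are disjoint).
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

# A perfect prediction scores 1.0; disjoint boxes score 0.0.
assert iou((0, 0, 10, 10), (0, 0, 10, 10)) == 1.0
assert iou((0, 0, 10, 10), (20, 20, 30, 30)) == 0.0
# A half-overlapping prediction: intersection 50, union 150, so IoU is 1/3.
assert abs(iou((0, 0, 10, 10), (5, 0, 15, 10)) - 1/3) < 1e-9
```

Detection benchmarks typically count a prediction as correct only when its IoU with a ground-truth box clears a threshold such as 0.5, which is what feeds into AP and mAP.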

System Implementation Standards

LiDAR is one of many standard systems emerging behind autonomous vehicles. Beyond bare visual processing and prediction, LiDAR helps accurately map a car's surrounding environment. It isn't necessarily intended for predictive positioning of objects, but it provides a quicker and more accurate picture using reflected laser light. Think of it as a more refined and advanced take on the role radar has served in assistive technologies.

Radar in vehicles has been a cornerstone for autonomous driving for a few years. It has helped inform collision detection, lane keeping, and blind spot awareness. Plus, radar works with robust visual imaging suites and LiDAR for complete awareness of everything around the vehicle.

NHTSA

The National Highway Traffic Safety Administration is making real headway toward defining what AD needs to be truly ready for America's roads. The NHTSA has done a great deal to standardize automobile safety features over the past few years, issuing safety feature stipulations for auto manufacturers covering model years 2016 through 2025. These recent additions are partially automated and very much in line with the aims of autonomous driving; they include features like lane-keeping assist, adaptive cruise control, and traffic jam assist.

The NHTSA has a stated goal for all new automobiles manufactured in the United States to carry fully automated safety features from 2025 onward. With the headway made in the systems above, it may well be on its way to ushering in autonomous driving across a wide swath of vehicles.

Moving Forward With Autonomous Driving

Autonomous driving has progressed significantly toward providing standardized systems and guidelines for developing autonomous vehicles. As these vehicles – and their technology – mature, there will only be more robust frameworks and guidelines to bolster them.

Are you looking for actionable experience in developing your own autonomous driving systems? Digital Divide Data has the means and experience to develop robust systems adhering to the guidelines mentioned in this article. We offer support for a wide variety of visual imaging, object classification, and semantic segmentation tasks. If you're looking to bolster your AD platform, choose DDD to supply industry know-how for your data annotation.


Using Aerial Imagery as Training Data


By Aaron Bianchi
Aug 6, 2021

Numerous industries use satellite and aerial imagery to apply machine learning to business and social problem sets. This is a particular strength for DDD given our experience in geospatial and aerial use cases in insurance, transportation, meteorology, environmental protection, agriculture, law enforcement, national security, remote delivery, and traffic management.

This experience has taught us a great deal about the challenges and pitfalls associated with aerial image segmentation. We aired a webinar on this subject, and you can view the recording on-demand. Our goal is to deliver a hands-on guide to overcoming these challenges.

  1. Price of failure. Consider the cost of inadequately or incorrectly training an algorithm to evaluate geospatial or aerial images. Say the project is agricultural. Can you imagine the impact of incorrectly identifying crop disease or inadequate irrigation? Or say the project is military. What are the potential costs of misidentifying an elementary school as an army barracks? You need to factor the expected cost of this kind of failure into your DIY vs. outsource equation.

  2. Workforce. Aerial and geospatial images are often very large and very detailed, meaning that a large workforce of labelers is required to generate sufficient volumes of training data in a timely fashion. In our experience, most data science teams don’t have access to an in-house workforce big enough to meet their training data demands. This lack of a workforce is one of the principal drivers of seeking a training data partner.

  3. Data volumes. Keep in mind that you may be able to support in-house data preparation for an initial, simple use case, but in your quest for greater levels of model confidence, you will have to train your algorithm on additional use cases, and eventually edge cases. You may be able to generate enough data in-house to train an algorithm to land a delivery drone on a simple graphical marker, but what does it take to distinguish between a leaf on the marker and a three-year-old child? Each additional use case requires at least as much training data as the first one, and rarely occurring edge cases may require significantly more data. This dramatically compounds your workforce requirements, a discovery that many data science teams make late in their projects when budgets are dwindling and deadlines are imminent.

  4. Process and tools. Extremely high-resolution images are far too large to assign to a single labeler. But breaking up images, assigning them to multiple labelers, and then reassembling everything coherently introduces issues around worker consistency and process management. Do you have the wherewithal to train consistency into your own workforce? Do you have the technology and process required to track changes to very high numbers of image fragments? Most data science teams don’t.

  5. Specialization. Are you confident that you can define the most efficient tasks required to label your training data? We had a client who wanted us to label every individual tree in enormous hi-res forest images. As it happened, they weren’t interested in tree density; rather, they were trying to detect illegal land clearing. Because we have been preparing training data for decades, we were able to show them a different approach to labeling their images that appropriately trained their algorithm, but at a fraction of the time and cost of their approach.

  6. Focus. Preparing training data for aerial and geospatial systems involves the application of human judgment to nuanced, and sometimes hard-to-decipher, images. Our own data shows that the longer individuals spend on a particular kind of interpretation, the faster and more accurately they do the work. Data science teams that crowdsource their aerial segmentation work do not capture these workforce efficiencies. DDD assigns you a team that stays with you throughout the span of your project, meaning that you capture all the benefits of growing worker efficiency while your effective cost per transaction steadily declines.
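The image-splitting step described in point 4 can be sketched as follows. Tile size and overlap are illustrative parameters; a real pipeline would also record provenance for each tile so labels can be reassembled and de-duplicated in the overlap regions.

```python
def tile_image(width, height, tile, overlap=0):
    """Return (x, y, w, h) crops covering a width x height image.

    Overlapping tiles help labelers keep context at tile borders; a later
    reassembly step merges duplicate labels from the overlap regions.
    """
    step = tile - overlap
    tiles = []
    for y in range(0, height, step):
        for x in range(0, width, step):
            w = min(tile, width - x)   # clip edge tiles to the image
            h = min(tile, height - y)
            tiles.append((x, y, w, h))
            if x + tile >= width:      # stop once the row is fully covered
                break
        if y + tile >= height:         # stop once the image is fully covered
            break
    return tiles

# A 2048x1024 image with 512px tiles and 64px of overlap for context.
crops = tile_image(2048, 1024, tile=512, overlap=64)
```

Even this toy version hints at the management problem: a single high-resolution aerial image can fan out into hundreds of fragments whose labels must be tracked and reconciled.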


ML Data Preparation Demands a Big Toolbox

By Aaron Bianchi
May 11, 2021

Part of the challenge of building machine learning models is that no two are the same. Train the same machine learning algorithm against different sets of data, and you end up with a different model.

If the quality of the raw data is high and the training data sampling is done well, the models shouldn’t vary a lot… but those are all big “if”s. Which is why data preprocessing, the actual data preparation process, is critically important.

A Forbes survey revealed that data scientists spend nearly 80% of their time on data prep: a quarter of that on data collection, and the other three quarters on data cleaning. Other survey results indicate that real-world data science isn't everything these practitioners thought it would be; clearly, data collection and data cleaning are not how they imagined they'd spend their working hours.

Data preparation is so time consuming because it is so important. The adage, or more appropriately in this setting, admonition, "Garbage In, Garbage Out" very much applies to data preparation for machine learning, which, in extreme cases, can involve the entire lifecycle from data collection to data cleaning and feature engineering. Missteps at any point in this process will result in low-confidence model predictions, or even a model that simply performs poorly.

Beyond their importance, training data sets for machine learning algorithms are also voluminous – many millions of data items in the case of complex problem spaces – and much of the data prep work demands human involvement (although much of this work is often repetitive and requires only contextual training to perform).

Finally, data preprocessing usually involves a variety of technologies, both for doing the actual work of preparing the data and for managing quality in the context of volume. If the problem space is simple – say, structured data with duplicates, null values and some lack of standardization – the technology needn’t be complex. But complex problem spaces – say, identifying and tracking video objects with complex taxonomies – can require specialized technology, much of it open source, with very particular feature sets.

In recent years, numerous solution providers have sprung up to fill the human and technological gaps that data scientists confront in preparing quality training data at scale. Some offload the human labor side of data prep. Others provide technology for cleaning and labeling training data sets. And some provide both.

Data science teams would be well-advised to choose carefully when evaluating data preparation partners. DDD has “inherited” more than a few customers whose first vendor fell down, invariably on one or both of these two dimensions:

Something less than full lifecycle data prep. Recall that the Forbes survey indicated that roughly 60% of data scientists' time (three quarters of the 80% spent on data prep) goes to data cleaning. Most of today's data preparation vendors emphasize training data labeling and annotation; they presume they will be given data that has already been cleaned.

If your data needs cleaning, i.e.,

  • it has not been de-duped

  • it is missing information

  • it is inconsistently presented from different sources

  • it requires entity resolution

  • it is image data of uneven quality or perspective

  • it is handwritten and requires transcription

If you don’t have data, or don’t have enough data, and need data collected or created, these vendors are not a good fit for you.

Reliance on a single technology platform. Every ML project is unique, with unique data. No single set of proprietary tools can possibly match up to every machine learning algorithm and training data set. Data science teams need to know that whoever is doing their data preparation is technology platform agnostic. They need to know that whoever is doing their data preparation has the flexibility and freedom to choose the best tool(s) for the project at hand and is not trying to shoehorn data into an inappropriate tool or trying to jerry-rig a third-party tool onto their platform for the sake of the team’s project.

