Open Data Distillation. Building a Deeper View of Impact for Policymakers and Evaluators.
An Increasingly Complex Research Agenda
The quest for knowledge is one of life’s constants, driven by a need to understand human behaviours, societal interactions and the consequences of collective decisions and investments. With this comes a desire to expand the frames of reference from which information is gathered, and to make more effective use of technologies that can discover, decipher and analyse data from a broader range of sources. The public sector is no exception, given the breadth of policy areas it interacts with. Research commissioned by Government and public bodies is increasingly complex, and eager to answer challenging questions about the economy and what drives its growth. This is especially true of economic development, the primary policy area that explores and tracks the dynamics of companies, sectors and ecosystems.
More specifically, there is growing demand for deeper insight into key domains of interest, particularly how companies and sectors change in response to public sector policy and investment.
There is also a desire to pay closer attention to a broader array of business performance and activity signals, to enable more proactive policy responses and to generate investment leads. In parallel, this has driven a need for evidence, that is, data that tells a more compelling and robust story and that, integrated with other research methods, helps to explain why changes and impacts are arising. To some extent, the push for richer and more expansive evidence has been propelled by the weaknesses of existing approaches to tracking company and sector change, and by an over-reliance on singular data points. These approaches draw on official sources and well-established methods that have value, but also inherent limitations, owing to a lack of depth, detail and timeliness. In reality, there is an opportunity to take the data story further.
With rapid developments in technology, particularly large-scale storage of data on the web, greater open disclosure of company-level information, and growing computational resources, the exceptional processing power of AI can now be harnessed. Here glass.ai has played, and continues to play, an important role by providing access to data that tells a coherent story of impact and change. By crawling the open web and distilling vast amounts of information across millions of websites and data points, we are now able to generate a ‘package of indicators’ that offers a more comprehensive view of impact. In doing so, we have created a new and exciting standard for data tracking at a company level, which offers flexibility of focus, and breadth, across a variety of themes.
Use Cases — Data Distillation in Practice
glass.ai’s technology offers the ability to deep read open web data and collect information from a vast array of sources, aligned to bespoke signals or indicators. By grouping these around themes or areas of specific interest, linked to company, sector or place-based activity, more can be understood about the direction of travel, the pace of change and ultimately, the scale of impact. This means commissioners and researchers are armed with information that can probe and answer questions with a greater degree of robustness and fluency.
Perhaps the key benefit of reading information from the web and open sources is the ability to combine signals and develop a genuinely composite picture. On its own, this data is extremely powerful; when combined with other data collection methods, such as primary research, the effect is truly compelling. Considering the data thematically introduces the opportunity to observe change through the lens of topics that resonate strongly with economic development policymakers, such as:
Growth: showcasing how companies are expanding commercial capabilities and footprints.
Innovation: showcasing how companies are evolving their technologies, products and services.
Trade: showcasing how companies are engaging global markets through exporting and trade.
Decarbonisation: showcasing how companies are progressing towards Net Zero operation.
Network mapping: showcasing the connectedness of companies, sectors and assets that help to shape ecosystems.
There are several examples of us working with public sector partners to develop thematic indicators. Recent clients include the Oxford to Cambridge and Great South West pan-Regional Partnerships, Hampshire and Surrey County Councils (formerly Enterprise M3 Local Enterprise Partnership), the North East Combined Authority (formerly North East Local Enterprise Partnership) and Hounslow Borough Council. Here, within given themes, bespoke indicators have been used to isolate the signals and behaviours that best reflect company and sector change within the research context. The process has several clear advantages for the researcher and end-user:
Obtaining data that reflects the latest position and is more timely.
Balancing evidence from direct and indirect sources (i.e. company website, news, third-party datasets) to present a composite picture.
Gathering signals that highlight intent or a direction of travel, as well as an actual reported performance position.
Mitigating data gaps that may arise when a single indicator or source is used for reporting.
Collecting data in a repeatable fashion, at regular intervals.
Looking more specifically at thematic data use cases, an area that continues to be important in the context of economic development policy is understanding the innovative capacity of businesses. Here glass.ai data can help build a composite picture of company and sector-level capability, by drawing data from several related signals. The infographic below shows a sample of the indicators that can be discovered through a curated web crawling process:
Another area that remains of significant interest to researchers, policymakers and evaluators is decarbonisation and the shift towards a ‘just transition’ to Net Zero. Here we can provide a fuller picture of activity, beyond more binary markers such as the reporting of carbon emissions. Using the contextual reading capabilities of our AI, it’s possible to distil company and sector-level Net Zero activity from a variety of sources and show a broader view of decarbonisation:
In each of these use case examples, the glass.ai data offered further benefits for onward analysis:
Consistent presentation of data in a single source dataset, minimising the need for cross-referencing or matching with other sources.
Provenance of data provided, including links to web sources and textual snippets, to enable confident reporting.
No limitations imposed on data citation and use, including publication or sharing with external audiences.
Scalability offered by increasing the breadth of indicators to look at other signals of activity and change.
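As a purely illustrative sketch of what a single-source dataset with provenance might look like in practice, the Python below models each signal as one record carrying its source link and text snippet, then counts how many distinct indicators have been observed for each company within each theme. The field names, indicator names, companies and URLs are all hypothetical assumptions made for the example; this is not glass.ai's actual schema or pipeline.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    """One observed signal, kept together with the provenance needed to cite it."""
    company: str
    theme: str       # e.g. "Innovation" or "Decarbonisation"
    indicator: str   # hypothetical indicator name
    source_url: str  # web page the evidence was read from
    snippet: str     # supporting text extract


def theme_coverage(signals):
    """Count the distinct indicators observed for each (company, theme) pair."""
    seen = defaultdict(set)
    for s in signals:
        seen[(s.company, s.theme)].add(s.indicator)
    return {key: len(indicators) for key, indicators in seen.items()}


# Made-up records for illustration only.
signals = [
    Signal("Acme Ltd", "Innovation", "new_product_launch",
           "https://example.com/acme/news", "Acme launches a new sensor range"),
    Signal("Acme Ltd", "Innovation", "patent_filing",
           "https://example.com/acme/about", "Acme has filed a patent for its design"),
    # Same indicator seen in a second source: more evidence, not a new indicator.
    Signal("Acme Ltd", "Innovation", "new_product_launch",
           "https://example.org/trade-press", "Acme unveils its sensor range"),
    Signal("Acme Ltd", "Decarbonisation", "net_zero_target",
           "https://example.com/acme/sustainability", "Acme commits to a Net Zero target"),
    Signal("Beta plc", "Innovation", "r_and_d_hiring",
           "https://example.com/beta/careers", "Beta is recruiting research engineers"),
]

coverage = theme_coverage(signals)
```

Because every record keeps its own link and snippet, any figure in a summary like `coverage` can be traced back to the text it came from, and adding a new indicator or theme only means adding more records of the same shape.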
The use cases set out above are not exhaustive, but have been shaped both by the demand of our clients and by where we have seen resonance with consistently important agendas. There are a number of other thematic areas that can be explored using a crawling approach. Indeed, there may also be an overlap between themes and individual indicators, where signals have multidimensional value. With this in mind, a broader array of indicators associated with growth could be incorporated within a different theme. Given we have been able to automate data collection across 14 deeper growth signals (as evidenced here), opportunities for indicator customisation and expansion are greater still.
Summary — An Increasing Appetite for Evidence
The quest for data to support innovative and complex public sector research is likely to accelerate. Ambitions are high, and the need for evidence to underpin significant public policy and investment decisions will surely grow too. This suggests a role for new methods and cutting-edge technology. Indeed, recent trends imply a growing appetite from the UK Government for AI-powered solutions, which move beyond well-established but static databases. Here, data gathered from the open web by AI that understands language at scale and in context will be an invaluable resource for researchers and policymakers. Critically, it will form part of a broader research arsenal, bringing together other data points and augmenting the insights that can be gathered through primary research.
Looking ahead, glass.ai will be at the forefront of this progression, building bespoke datasets that offer a blend of carefully curated indicators to paint a compelling picture of impact. We look forward to working with partners and clients on such assignments, expanding into other areas of exploration, where web-based data can help respond to challenging questions within company, sector and place-based research.
Get in touch if you want to learn more about our thematic tracking capabilities: info@glass.ai.