The difference between (raw) data and intel
Last month, Sam Holland attended Sustainability Live, an exciting conference that focused on eco-friendly concepts in modern technology. One key idea that stood out for Sam was the point that climate change research requires a lot more than data: it needs intel, too.
What is the difference between data and intel?
Put simply, data is the collection of facts and/or statistics that can be used for reference or analysis; intel is useful information for a specific discipline.
The above may sound like a mere nuance, but it is also a significant one. When it comes to the value of data (particularly raw data), consider the following cliche: ‘99% (or any other high percentage) of statistics are made up.’
This is a testament to the importance of context in any field of research, and climate change research is no different.
At the time of writing (early October), the live counter on the climate change research site, theworldcounts.com, says that there have been well over 33 billion tons of CO2 emitted into the atmosphere so far in 2022. Given that we have just begun the fourth quarter of 2022 at the time of writing, such a statistic may appear accurate (even if it’s just on an intuitive level). As the webpage’s source explains, “The world emits about 43 billion tons of CO2 a year (2019). [This is based on] total carbon emissions from all human activities, including agriculture and land use”.
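The intuition that the counter's figure is plausible can be checked with a rough pro-rata calculation. The sketch below is only a sanity check under loud assumptions: it treats emissions as accruing evenly through the year, it reuses the 2019 annual figure quoted above, and it picks 5 October 2022 as a stand-in for "early October". The function name is hypothetical, and the site's own counter may well use a different method.

```python
from datetime import date

# Annual total quoted by the source above (billion tons of CO2, 2019 figure).
ANNUAL_EMISSIONS_BN_TONS = 43.0


def prorated_emissions(on: date, annual_total: float = ANNUAL_EMISSIONS_BN_TONS) -> float:
    """Estimate year-to-date emissions, assuming they accrue evenly all year."""
    year_start = date(on.year, 1, 1)
    next_year_start = date(on.year + 1, 1, 1)
    fraction_elapsed = (on - year_start).days / (next_year_start - year_start).days
    return annual_total * fraction_elapsed


# Assumption: "early October" is taken to be 5 October 2022.
estimate = prorated_emissions(date(2022, 10, 5))
print(f"Rough year-to-date estimate: {estimate:.1f} billion tons")
```

Under these assumptions the estimate lands between 32 and 33 billion tons, which is the same order of magnitude as the counter's reading; the point is plausibility, not precision.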
Nevertheless, these references to the tens of billions of tons of carbon emitted each year form unimaginably high numbers in our minds. But perhaps that is the problem: they are literally unimaginable. The scale of CO2 releases is simply too large to comprehend, and there is no context attached to what carbon emissions should mean as a metric for human-made impacts on Earth’s climate. As the section below considers, this is one of the many reasons that the various stages of data analysis must be taken into account.
Ascertaining raw data, processed data, and intel
Earlier, the definition of ‘data’ was put forward as ‘facts and statistics used for reference and analysis’. To put it more precisely, however, an adjective should also be introduced, because ‘data’ may otherwise be used synonymously with ‘raw data’.
The real term for such data, i.e. facts and statistics that can be used for analysis, is ‘processed data’. In other words, processed data is what the facts and statistics become after data scientists and other researchers have translated the numbers into a more readable form.
As an example of this process, consider the different terms in the context of traffic observation:
If a researcher films a residential road for a full working day and finds that a total of 100 vehicles drove along that road from 6am to 8pm, they are left with raw data.
However, if a data scientist were to then turn the raw data from the video recording into processed data, they would need to take this log of 100 vehicles and categorise the entries. The information would then no longer be contextless statistics, because the expert could ascertain how many vehicles were on the road within a variety of time windows. For the sake of argument, let’s say that the findings showed there were:
40 cars on the road between 6am and 10am
10 cars on the road between 10am and 4pm
50 cars on the road between 4pm and 8pm
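The step from raw to processed data can be sketched in a few lines of code. This is a minimal illustration, not the method of any particular researcher: the raw log is a hypothetical list of sighting times, generated here only so that it reproduces the counts above, and the function names are invented for the example.

```python
from collections import Counter
from datetime import time
from typing import Dict, List, Optional

# Time windows from the example: (label, start hour inclusive, end hour exclusive).
WINDOWS = [
    ("6am-10am", 6, 10),
    ("10am-4pm", 10, 16),
    ("4pm-8pm", 16, 20),
]


def window_for(sighting: time) -> Optional[str]:
    """Return the label of the window a sighting falls into, or None if outside all."""
    for label, start_hour, end_hour in WINDOWS:
        if start_hour <= sighting.hour < end_hour:
            return label
    return None


def count_by_window(sightings: List[time]) -> Dict[str, int]:
    """Turn a raw log of sighting times into per-window vehicle counts."""
    counts = Counter(window_for(s) for s in sightings)
    return {label: counts.get(label, 0) for label, _, _ in WINDOWS}


def evenly_spaced(start_hour: int, end_hour: int, n: int) -> List[time]:
    """Hypothetical helper: fabricate n evenly spaced sightings inside one window."""
    span_minutes = (end_hour - start_hour) * 60
    minutes = [start_hour * 60 + (i * span_minutes) // n for i in range(n)]
    return [time(m // 60, m % 60) for m in minutes]


# Illustrative raw log only: 100 made-up timestamps matching the article's example.
raw_log = evenly_spaced(6, 10, 40) + evenly_spaced(10, 16, 10) + evenly_spaced(16, 20, 50)

processed = count_by_window(raw_log)
print(processed)  # {'6am-10am': 40, '10am-4pm': 10, '4pm-8pm': 50}
```

The raw log on its own is just 100 timestamps; only after it has been grouped by window does a pattern become readable.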
The processed data would then show that, in a working day, the highest number of cars can be observed in the late afternoon and the second highest in the early morning. But there is still an absence of value (see data valuation). The question arises: what can we learn from the raw-turned-processed data?
Approaching questions like this is the key to avoiding being bogged down in uncritical data (often formed of arbitrary statistics) and moving on to the milestone in both quantitative and qualitative data: intel. The process of gathering such intelligence, particularly in this context of climate change and peak road traffic conditions, could be exemplified as follows:
A recording of 100 vehicles on one road in one working day suggests that road-based air pollution follows the traffic demands of the working day (in other words, both hit their peaks during the rush hours seen from Monday through to Friday).
The second highest percentage (40%) of vehicles are on the road in the morning window (6am to 10am), around the time commuters’ working days start, and the highest percentage (50%) of vehicles are on the road in the evening window (4pm to 8pm), around the time that working day comes to an end.
This is in contrast to off-peak hours (between 10am and 4pm), which see only 10% of the total vehicles observed on the road in one working day.
Conclusion: the observation of peak times of vehicle traffic, and therefore peak instances of greenhouse gas emissions (exhaust fumes) throughout each working day, can inform people’s understanding of when roadway pollution is at its worst and, in turn, when best to combat it.
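The reasoning in the points above amounts to simple arithmetic on the processed counts, which can be sketched as follows. The counts are those of the example; the rule for calling a window a "peak" (a share above an even split between the windows) is an assumption made purely for illustration, as are the function names.

```python
from typing import Dict, List

# Processed counts from the traffic example above.
counts = {"6am-10am": 40, "10am-4pm": 10, "4pm-8pm": 50}


def shares(window_counts: Dict[str, int]) -> Dict[str, float]:
    """Convert per-window counts into percentages of the day's total."""
    total = sum(window_counts.values())
    return {label: 100 * n / total for label, n in window_counts.items()}


def rank_windows(window_counts: Dict[str, int]) -> List[str]:
    """Order the windows from busiest to quietest."""
    return sorted(window_counts, key=window_counts.get, reverse=True)


def peak_windows(window_counts: Dict[str, int]) -> List[str]:
    """Assumed rule: a window is a 'peak' if its share exceeds an even split."""
    even_split = 100 / len(window_counts)
    return [label for label, pct in shares(window_counts).items() if pct > even_split]


print(shares(counts))        # {'6am-10am': 40.0, '10am-4pm': 10.0, '4pm-8pm': 50.0}
print(rank_windows(counts))  # ['4pm-8pm', '6am-10am', '10am-4pm']
print(peak_windows(counts))  # ['6am-10am', '4pm-8pm']
```

The numbers do not change between the processed data and the intel; what changes is that they are now attached to a question (when is traffic heaviest?) and an answer that someone can act on.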
This intel (namely the conclusion taken from both the raw and processed traffic data) can be used to advise homeowners on how best to conserve energy in their homes. This is because the finding helps to explain why users of air purifiers and HVAC systems can save energy by activating their ventilation devices during rush hours – but not necessarily during the low-traffic times of the day. On a similar note, consider that urban homeowners (especially those with hay fever) are often told to avoid opening their windows during peak traffic times.
Considering the crucial nature of intel
This article’s consideration of small-scale roadway pollution has served as a mere microcosm of peak pollution times throughout the days of the week, particularly the working days. It does, however, show that, without intel, data (especially statistics) can be used with political leanings and other biases to convince people of all manner of things, particularly with regard to the environment. This is a core reason why data, such as CO2 emission statistics, should always be read with the importance of intel in mind: data without intelligence, after all, is like a quotation without a source.