A data visualiser tool that produces return period – loss graphs of low-impact hazardous events for 91 countries. The tool uses publicly available data from national disaster loss databases, and it can also work with other data sources.
This tool is a part of a UCL IXN project: "Define return periods for low-impact hazardous events" with IFRC.
Define return periods for low-impact hazardous events
IFRC and individual Red Cross/Red Crescent National Societies are increasingly focusing on dedicating resources to take anticipatory action to mitigate the impacts of relatively high-frequency, low-intensity natural hazards. In other words, instead of reserving funds for those events that are expected to take place once every five years, IFRC and the National Societies aim to address events that occur more frequently, such as once per two or three years.
In order to assess when an event (whether forecasted or already occurred) has reached the new threshold, we need to have impact exceedance curves based on observational data for these specific types of hazardous events. Using publicly available data from national disaster loss databases (DesInventar), EM-DAT, IFRC and other sources, the goal of this project would be to create exceedance curves and tables for multiple impacts for as many National Societies as possible.
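To make the idea of a return period concrete, here is a minimal sketch of an empirical calculation. It is not necessarily the method this tool uses: the function name `empirical_return_periods`, the sample death counts and the simple `years / rank` estimator are all assumptions made for illustration.

```python
from typing import List, Tuple


def empirical_return_periods(
    losses: List[float], years_observed: float
) -> List[Tuple[float, float]]:
    """Return (loss, return_period_in_years) pairs, largest loss first.

    Illustrative estimator only: assumes one loss value per event and
    distinct loss values.
    """
    if years_observed <= 0:
        raise ValueError("years_observed must be positive")
    ranked = sorted(losses, reverse=True)
    # The k-th largest loss was reached or exceeded k times in the record,
    # so its empirical return period is years_observed / k.
    return [
        (loss, years_observed / rank)
        for rank, loss in enumerate(ranked, start=1)
    ]


# Hypothetical death counts from six flood events recorded over 15 years.
deaths = [0, 5, 3, 10, 1, 2]
curve = empirical_return_periods(deaths, years_observed=15)
for loss, period in curve:
    print(f"{loss:>4} deaths: reached about once every {period:.1f} years")
```

Plotting the return period against the loss gives one point per event. An event whose loss sits at a return period of two to three years would fall in the high-frequency, low-intensity range described above.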
Check out our online demo.
git clone https://github.com/COMP0016-IFRC-Team5/data-visualiser
cd data-visualiser
conda env create -f conda_env.yml
conda activate data-visualiser
- Using pip (Python 3.10+)
pip install -r requirements.txt
- Get data using data-downloader module
- Process data using data-processor module
The example shows a typical case: it produces return period graphs of deaths and affected people for floods and earthquakes in Albania and Pakistan, using data from the past 15 years.
python example.py
A typical process could be done in 3 steps:
- set data folder path
- plot graph(s)
- get table(s)
To use default processed data:
visualiser.set_data_folder('./data')
Then you can get the countries available for analysis by calling visualiser.get_available_countries() after setting the data folder.
print(visualiser.get_available_countries())
API for plotting exceedance curves:
visualiser.plot_exceedance_curves(
    countries,
    events,
    losses,
    years_required
)
Args:
- countries: A string or list of strings specifying the countries.
- events: A string or list of strings specifying the events.
- losses: A Loss enum or list of Loss enums specifying the losses.
- years_required: An int specifying the maximum number of years of data required. Default is -1.
The tool also provides a function that extracts the key return periods for all defined metrics and organises them as a table. The table can be accessed by calling visualiser.get_exceedance_table():
tables = visualiser.get_exceedance_table(
    countries,
    events,
    years_required
)
Currently, only deaths and affected people (directly affected + indirectly affected) are defined. If you want to add more metrics, you can do so in visualiser/_models/_loss.py.
In visualiser/_config.py, you can set __SELECTED_FOLDER to the folder on which you want to conduct the analysis.
You can find the relevant code in the __add_label() and __highlight() methods of the Plotter class.
If you want to use another data source, you need to put it under the data directory and ensure the folder structure is:
data-visualiser/
├─ data/
│ ├─ new_data_source/
│ │ ├─ country_name/
│ │ │ ├─ EARTHQUAKES.csv
│ │ │ ├─ FLOODS.csv
│ │ │ ├─ STORMS.csv
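A layout like the one above can be checked before running the tool. The sketch below is a standalone helper, not part of the visualiser package. The function name `missing_event_files` is hypothetical, and it assumes every country folder is expected to hold exactly the three event files shown in the tree, which may not be true for every data source.

```python
import tempfile
from pathlib import Path

# Assumption: these are the event files each country folder should contain.
EXPECTED_FILES = ("EARTHQUAKES.csv", "FLOODS.csv", "STORMS.csv")


def missing_event_files(source_dir: Path) -> dict:
    """Map each country folder name to the expected CSV files it lacks."""
    problems = {}
    for country_dir in sorted(p for p in source_dir.iterdir() if p.is_dir()):
        missing = [
            name for name in EXPECTED_FILES
            if not (country_dir / name).is_file()
        ]
        if missing:
            problems[country_dir.name] = missing
    return problems


# Build a throwaway layout to demonstrate the check.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "data" / "new_data_source"
    (source / "Albania").mkdir(parents=True)
    for name in EXPECTED_FILES:
        (source / "Albania" / name).touch()
    (source / "Pakistan").mkdir()
    (source / "Pakistan" / "FLOODS.csv").touch()
    report = missing_event_files(source)

print(report)
```

An empty dictionary means every country folder holds all of the expected files.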
For each CSV file, the data should be parsed to contain these columns: deaths, directly_affected, indirectly_affected, start_date, and secondary_end.
For example:
deaths | directly_affected | indirectly_affected | start_date | secondary_end |
---|---|---|---|---|
0 | 100 | 200 | 1911-02-18 | 1911-02-21 |
5 | 60 | 300 | 1912-02-18 | 1912-02-21 |
3 | 100 | 100 | 1914-02-18 | 1914-02-21 |
10 | 220 | 400 | 1916-02-18 | 1916-02-21 |
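Before pointing the tool at a new data source, it can help to confirm that a file carries the required columns. The sketch below is a standalone check using only the standard library. The function name `load_events` is hypothetical, and it assumes comma-separated files with integer counts. It also derives the affected total as directly affected plus indirectly affected, mirroring the definition given earlier.

```python
import csv
import io

REQUIRED_COLUMNS = [
    "deaths",
    "directly_affected",
    "indirectly_affected",
    "start_date",
    "secondary_end",
]


def load_events(handle):
    """Read event rows, check the required columns, and add total affected."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [column for column in REQUIRED_COLUMNS if column not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    events = []
    for row in reader:
        events.append({
            "deaths": int(row["deaths"]),
            # Affected people = directly affected + indirectly affected.
            "affected": int(row["directly_affected"])
            + int(row["indirectly_affected"]),
            "start_date": row["start_date"],
            "secondary_end": row["secondary_end"],
        })
    return events


# The first two rows of the example table, written as comma-separated text.
sample = io.StringIO(
    "deaths,directly_affected,indirectly_affected,start_date,secondary_end\n"
    "0,100,200,1911-02-18,1911-02-21\n"
    "5,60,300,1912-02-18,1912-02-21\n"
)
events = load_events(sample)
print(events[0]["affected"], events[1]["affected"])
```

For a real file, pass an open file object instead of the in-memory sample, for example `open(path, newline="")`.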
Next, you need to add a member in visualiser/_adapters/_folders.py whose value is the name of the data source folder.
Then, you need to modify __SELECTED_FOLDER in _config.py.
Note: if you are working with a new data source, you need to ignore or remove the labels after plotting the curves.
- Dekun Zhang @DekunZhang
- Hardik Agrawal @Hardik2239
- Yuhang Zhou @1756413059
- Jucheng Hu @smgjch