This project analyzes population growth data from 1950 to 2023 using Python libraries such as Pandas, NumPy, Matplotlib, and Plotly.
The dataset used in this analysis is stored in a CSV file named 'Population Growth.csv'. It contains the following columns:
- Year: The year of the population data.
- Population Growth Rate: The population growth rate for each year.
- Growth Rate: The growth rate percentage for each year.
To run this project, you need to have the following Python libraries installed:
- pandas
- numpy
- matplotlib
- plotly
You can install these libraries using pip:
pip install pandas numpy matplotlib plotly
- Make sure you have the required dependencies installed.
- Place the 'Population Growth.csv' file in the same directory as the Python script.
- Run the Python script.
The script performs the following data analysis tasks:
- Reads the CSV file into a Pandas DataFrame.
- Displays the first few rows of the DataFrame using `head()`.
- Prints information about the DataFrame using `info()`.
- Prints a statistical summary of the DataFrame using `describe()`.
- Extracts the 'Year', 'Population Growth Rate', and 'Growth Rate' columns from the DataFrame.
- Converts the 'Population Growth Rate' column to numeric values.
- Finds the minimum value, maximum value, and the number of data points in the 'Growth Rate' column.
- Creates a line plot using Plotly to visualize the population growth rate over the years.
- Creates another line plot using Plotly to visualize the growth rate percentage over the years.
- Creates a scatter plot using Matplotlib to show the relationship between population and growth rate.
- Customizes the x-axis ticks of the scatter plot.
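The loading, cleaning, and summary steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: the helper name `load_and_summarize`, the in-memory sample CSV, and its values are all made up for the example, and the thousands-separator handling is only an assumption about why the 'Population Growth Rate' column needs converting.

```python
import io

import pandas as pd

# Hypothetical stand-in for 'Population Growth.csv'. The values are invented
# for illustration and are NOT taken from the real dataset.
SAMPLE_CSV = """Year,Population Growth Rate,Growth Rate
1950,"1,000",1.5
1951,"1,020",2.0
1952,"1,050",2.9
"""


def load_and_summarize(source):
    """Read the CSV, coerce one column to numeric, and summarize 'Growth Rate'."""
    df = pd.read_csv(source)

    # Assumption: the raw column is text with thousands separators.
    # Strip the commas, then convert; unparseable entries become NaN.
    df["Population Growth Rate"] = pd.to_numeric(
        df["Population Growth Rate"].astype(str).str.replace(",", "", regex=False),
        errors="coerce",
    )

    growth = df["Growth Rate"]
    summary = {
        "min": growth.min(),
        "max": growth.max(),
        "count": int(growth.count()),  # number of non-missing data points
    }
    return df, summary


df, summary = load_and_summarize(io.StringIO(SAMPLE_CSV))

print(df.head())      # first few rows
df.info()             # column names, dtypes, memory usage
print(df.describe())  # statistical summary
print(summary)
```

To try this against the real file, pass the path `'Population Growth.csv'` instead of the `io.StringIO` object.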
The script generates the following visualizations:
- A line plot showing the population growth rate from 1950 to 2023.
- A line plot showing the growth rate percentage from 1950 to 2023.
- A scatter plot displaying the relationship between population and growth rate.
The script also prints the following information:
- The first few rows of the DataFrame.
- Information about the DataFrame, including column names, data types, and memory usage.
- A statistical summary of the DataFrame.
- The minimum value, maximum value, and number of data points in the 'Growth Rate' column.
If you would like to contribute to this project, please follow these steps:
- Fork the repository.
- Create a new branch for your feature or bug fix.
- Make your changes and commit them with descriptive commit messages.
- Push your changes to your forked repository.
- Submit a pull request to the main repository.
This project is licensed under the MIT License.
Feel free to use and modify the code as per your requirements.
- Thanks to the developers of the Pandas, NumPy, Matplotlib, and Plotly libraries for their valuable contributions to the Python data analysis ecosystem.