Data Preprocessing and Cleanup for SBIN Historical Stock Data #96
Ensure the PR matches the requirements mentioned in the Contribution guide. The maintainer might get in touch to ensure quality. Thanks for your time.
@Anjankumar26 Please link the issue using keywords (fixes/closes...). You can find the detailed info about the same here: Thanks & regards.
Looks good to me!
Thank you for your contribution and effort on this PR. As there hasn't been any activity or updates for a while, we will go ahead and close this PR to keep the repository organized. If you’d like to revisit or continue working on this, please feel free to reopen it or submit a new PR.
Thanks again for your work!
Regards,
Mayuresh
fixes #13
SBIN_cleaned.csv
This pull request addresses the preprocessing and cleanup of the SBIN historical stock dataset. The following steps were completed:
Date Column Formatting: Converted the 'Date' column from string format to a datetime object for easier analysis and consistency.
Missing Data Handling: Removed rows containing missing values across multiple columns including 'Open', 'Close', 'High', 'Low', 'Adj Close', and 'Volume'.
Feature Engineering: Added new columns for enhanced data analysis:
Price Range: Difference between the daily 'High' and 'Low' prices.
Daily Return: Percentage change between the 'Open' and 'Close' prices.
Descriptive Statistics: Basic statistical summary was generated to better understand the data distribution after cleanup.
The cleaned dataset is now ready for further analysis, including trend visualization, correlation studies, and volatility analysis.
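The preprocessing steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: it uses only the Python standard library, assumes ISO-style dates (`YYYY-MM-DD`), treats blank, `null`, and `nan` cells as missing, and computes Daily Return as `(Close - Open) / Open * 100`. The function name `clean_rows` and the sample rows are hypothetical.

```python
import csv
import io
from datetime import datetime

# Columns that must be present for a row to be kept (taken from the PR description).
REQUIRED = ["Open", "High", "Low", "Close", "Adj Close", "Volume"]


def clean_rows(reader, date_format="%Y-%m-%d"):
    """Return cleaned rows with a parsed Date and two derived columns.

    The date format is an assumption; adjust it to match the raw file.
    """
    cleaned = []
    for row in reader:
        # Missing data handling: skip rows where any required cell is empty.
        if any((row.get(col) or "").strip().lower() in ("", "null", "nan")
               for col in REQUIRED):
            continue
        # Date column formatting: string -> datetime object.
        record = {"Date": datetime.strptime(row["Date"], date_format)}
        for col in REQUIRED:
            record[col] = float(row[col])
        # Feature engineering: intraday range and open-to-close percentage change.
        record["Price Range"] = record["High"] - record["Low"]
        record["Daily Return"] = (record["Close"] - record["Open"]) / record["Open"] * 100
        cleaned.append(record)
    return cleaned


# Made-up sample data, for illustration only.
sample = io.StringIO(
    "Date,Open,High,Low,Close,Adj Close,Volume\n"
    "2020-01-01,100.0,110.0,95.0,105.0,105.0,1000\n"
    "2020-01-02,,,,,,\n"
    "2020-01-03,200.0,210.0,190.0,190.0,190.0,2000\n"
)
rows = clean_rows(csv.DictReader(sample))
```

With the sample above, the row dated 2020-01-02 is dropped because its price and volume cells are empty, and the two remaining rows gain the 'Price Range' and 'Daily Return' fields.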
Changes Made:
Added SBIN_cleaned.csv which contains the preprocessed data.
Removed missing data and ensured the dataset is suitable for analytical tasks.
Impact:
This PR ensures that the dataset is clean, properly formatted, and enriched with additional features, which will improve the accuracy and effectiveness of any future analysis or modeling.
Fixes #13
This pull request includes the exploratory data analysis (EDA) performed on the cleaned SBIN historical stock dataset. The following tasks were completed as part of the EDA:
Descriptive Statistics: Summary statistics for key numerical columns such as Open, Close, High, Low, Volume, Price Range, and Daily Return.
Trend Visualization: A time series plot was created to show the trend of the stock’s closing price over time.
Correlation Analysis: A heatmap was generated to identify correlations between key stock variables (Open, Close, High, Low, Volume, etc.).
Volatility Analysis: Histograms were plotted to show the distribution of Price Range and Daily Return, giving insights into stock price volatility.
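As a rough illustration of the numeric side of these EDA tasks, the sketch below computes summary statistics, a Pearson correlation matrix (the values a heatmap would display), and histogram bin counts (the values a histogram would plot). It is not the notebook's code: it uses only the Python standard library, the helper names `describe`, `pearson`, `correlation_matrix`, and `histogram` are hypothetical, and the sample numbers are made up.

```python
import math
import statistics


def describe(values):
    """Basic summary statistics for one numeric column."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "std": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


def pearson(xs, ys):
    """Pearson correlation coefficient; assumes neither column is constant."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def correlation_matrix(columns):
    """Pairwise correlations between all columns in a dict of lists."""
    return {a: {b: pearson(columns[a], columns[b]) for b in columns} for a in columns}


def histogram(values, bins):
    """Count values in equal-width bins spanning min(values)..max(values)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid zero width when all values are equal
    counts = [0] * bins
    for v in values:
        # The maximum value falls into the last bin.
        counts[min(int((v - lo) / width), bins - 1)] += 1
    return counts


# Made-up sample columns, for illustration only.
columns = {
    "Open": [100.0, 102.0, 101.0, 105.0],
    "Close": [102.0, 101.0, 105.0, 107.0],
    "Price Range": [4.0, 3.0, 6.0, 5.0],
}
summary = {name: describe(vals) for name, vals in columns.items()}
corr = correlation_matrix(columns)
range_bins = histogram(columns["Price Range"], bins=2)
```

Plotting is left out so the sketch has no dependencies; the `corr` values and bin counts could be passed to any plotting library to draw the heatmap and histograms.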
Changes Made:
Added EDA_SBIN_clean-checkpoint.ipynb notebook that contains the full EDA process, along with visualizations.
The analysis highlights important trends and patterns in the SBIN dataset.
Impact:
This PR provides insights into the behavior and volatility of SBIN stock, laying the groundwork for further predictive modeling or advanced analysis.