This is not a formal project but a personal initiative to apply what I recently learned in a data science course and to explore new concepts. Through this small effort, I gained hands-on experience with Jupyter notebooks, Python libraries, and various data visualization techniques.
- This notebook contains an exploratory data analysis (EDA) of the IPL 2023 season, covering team performances, player statistics, and match insights.
- Pandas and NumPy are used for data manipulation, and Matplotlib and Seaborn for visualization.
- The analysis includes team-wise and player-wise insights, match trends, scoring patterns, and more.
✔️ How to work with Jupyter Notebooks
✔️ Using Pandas for data manipulation
✔️ Creating visualizations using Matplotlib and Seaborn
✔️ Understanding different data types and structures
✔️ Generating match insights through EDA
- This is not a professional analysis and may contain inaccuracies.
- Two matches are missing from the dataset, so some insights may be incomplete or incorrect.
- The data used here may not be fully reliable; the purpose was learning and experimentation rather than drawing final conclusions.
The dataset used in this analysis was sourced from Kaggle.
🔗 Original Dataset Link: https://www.kaggle.com/datasets/sahiltailor/ipl-2024-ball-by-ball-dataset?select=ipl_2023_deliveries.csv
Before performing any analysis, the dataset was cleaned and preprocessed to ensure a smoother workflow:
✔️ Handling Missing Values: Removed unnecessary columns and dealt with missing or inconsistent data.
✔️ Filtering Relevant Data: Extracted key match details such as batting stats, wickets, extras, and over-wise progression.
✔️ Standardizing Team & Player Names: Ensured uniformity in naming conventions for teams and players.
✔️ Derived Columns: Created additional metrics for cumulative runs, strike rates, and match progression analysis.
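
The cleaning steps above could look roughly like the sketch below. It is not the notebook's actual code: the data is a tiny synthetic sample, and the column names (`match_no`, `batting_team`, `striker`, `runs_of_bat`, `extras`, `wide`, `comment`), the team alias `TX`, and the helper functions are all assumptions made for illustration.

```python
import pandas as pd

# Synthetic ball-by-ball sample. Column names and values are illustrative
# assumptions and may not match the Kaggle CSV.
raw = pd.DataFrame({
    "match_no": [1, 1, 1, 1, 1, 1],
    "batting_team": ["Team X", "TX", "Team X", "Team X", "TX", "Team X"],
    "striker": ["Batter A", "Batter A", "Batter B", "Batter B", "Batter B", "Batter A"],
    "runs_of_bat": [1, 0, 4, 6, 0, 1],
    "extras": [0, 1, 0, 0, 0, 0],
    "wide": [0, 1, 0, 0, 0, 0],
    "comment": ["", "", "", "", "", ""],  # stand-in for an unneeded column
})

def clean_deliveries(df):
    """Drop unused columns, standardize team names, add derived run columns."""
    out = df.drop(columns=["comment"], errors="ignore").copy()
    # Map a hypothetical alias onto one canonical team name.
    out["batting_team"] = out["batting_team"].replace({"TX": "Team X"})
    # Treat missing numeric values as zero.
    num_cols = ["runs_of_bat", "extras", "wide"]
    out[num_cols] = out[num_cols].fillna(0).astype(int)
    # Derived columns: runs per delivery and running total per innings.
    out["total_runs"] = out["runs_of_bat"] + out["extras"]
    out["cumulative_runs"] = (
        out.groupby(["match_no", "batting_team"])["total_runs"].cumsum()
    )
    return out

def strike_rates(df):
    """Strike rate = 100 * runs off the bat / balls faced (wides excluded)."""
    legal = df[df["wide"] == 0]
    stats = legal.groupby("striker")["runs_of_bat"].agg(runs="sum", balls="count")
    stats["strike_rate"] = (100 * stats["runs"] / stats["balls"]).round(2)
    return stats

clean = clean_deliveries(raw)
rates = strike_rates(clean)
```

Wides are excluded from balls faced because they do not count against the batter; no-balls are left in, which is the usual convention but worth checking against how the real dataset records them.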
🔹 Number of Matches per Season
🔹 Matches Played at Each Venue
🔹 Matches Played by Each Team
🔹 Team vs Team Runs Comparison (Heatmap)
🔹 Top Run Scorers & Top Wicket Takers
🔹 Most Sixes and Fours by Batsmen
🔹 Top Individual Match Scores
🔹 Total Runs Scored by Each Team
🔹 Wicket Types Distribution (Pie Chart)
🔹 Match 13 Worm Graph (Rinku Singh's Last-Over Heroics)
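
Two of the items above, the top run scorers and the worm graph, can be sketched with plain Pandas aggregations. This is a minimal illustration on synthetic data, not the notebook's code: the column names, the placeholder batters `A` to `D`, and the assumption that `over` is stored as `over.ball` (so `0.1` is the first ball of the first over) are all hypothetical.

```python
import pandas as pd

# Synthetic deliveries for one match; all names and numbers are placeholders.
balls = pd.DataFrame({
    "match_no": [13] * 8,
    "innings": [1, 1, 1, 1, 2, 2, 2, 2],
    "over": [0.1, 0.2, 1.1, 1.2, 0.1, 0.2, 1.1, 1.2],
    "striker": ["A", "A", "B", "B", "C", "C", "D", "D"],
    "runs_of_bat": [4, 1, 0, 6, 2, 0, 6, 6],
    "extras": [0, 0, 1, 0, 0, 0, 0, 0],
})

def top_run_scorers(df, n=3):
    """Total runs off the bat per batter, highest first."""
    totals = df.groupby("striker")["runs_of_bat"].sum()
    return totals.sort_values(ascending=False).head(n)

def worm_data(df, match_no):
    """Cumulative runs after each over, one column per innings."""
    match = df[df["match_no"] == match_no].copy()
    # Assumed encoding: integer part is the zero-based over index.
    match["over_no"] = match["over"].astype(int) + 1
    match["total_runs"] = match["runs_of_bat"] + match["extras"]
    per_over = match.groupby(["innings", "over_no"])["total_runs"].sum()
    # Running total within each innings, then pivot innings into columns.
    return per_over.groupby(level="innings").cumsum().unstack("innings")

top = top_run_scorers(balls)
worm = worm_data(balls, 13)
```

With Matplotlib installed, calling `worm.plot()` on the resulting frame draws one line per innings, which is the basic shape of a worm graph.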
This exploration was not about building a perfect dataset but rather an effort to apply new learnings and understand the workflow of a data analysis project. I now have a better understanding of Python, data visualization, and working with real-world sports datasets.