Analysis and Forecasting of O/L English Pass Rates (2019-2027)

Sep 26, 2025, 6:32 AM

OL-English.xlsx

I would like to analyze and forecast it.

Key Findings from the O/L English Dataset (2019 – 2024)

  1. General Trends
    • The number of candidates who sat the exam has remained fairly stable at ≈ 2 000 each year (see first line-chart).
    • The number passing climbed sharply after 2019 (1 302) and has held above 1 600 since 2022.
    • The pass-rate (%) jumped from 65 % in 2019 to nearly 97 % in 2020–2021 before settling in the low-to-mid-80 % range.
  2. Grade Distribution
    • Higher grades (A, B) have a gentle upward trend.
    • Failures (W) have dropped from 686 (2019) to 287 (2024).
    • Overall the quality of passes has been improving, not just the quantity.
  3. Forecast of Pass-Percentage
    Using an ARIMA(1,1,0) model (adequate for the six-year series):
    Year    Forecast Pass-%    95 % CI
    2025    85.6 %             53.8 – 117.5
    2026    85.6 %             42.1 – 129.2
    2027    85.6 %             32.9 – 138.4
    • Point estimates suggest the pass-rate will hover around ≈ 86 %.
    • The confidence intervals are wide because only six data-points were available; results should therefore be interpreted with caution.
    • If recent improvements in teaching quality continue, actual pass-rates could sit toward the upper end of these intervals.
  4. Practical Take-aways
    • Maintaining (or improving) the current pass-rate will require continued focus on reducing failures (grade W).
    • Targeted interventions for borderline candidates (those presently scoring S/C) could yield the biggest gains in overall pass-percentage.
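The 2019 figures quoted in the findings can be cross-checked with a little arithmetic. The sketch below assumes that every candidate who sat the exam either passed or received a W, so the number sat is the sum of the two; that assumption is not stated in the dataset summary and may not hold exactly.

```python
# Figures quoted in the findings for 2019.
passed_2019 = 1302   # candidates who passed
failed_2019 = 686    # candidates graded W

# Assumption: every candidate who sat either passed or received a W.
sat_2019 = passed_2019 + failed_2019
pass_rate_2019 = 100 * passed_2019 / sat_2019

print(f"Sat: {sat_2019}, pass rate: {pass_rate_2019:.1f} %")
```

Under that assumption the result is consistent with both the ≈ 2 000 candidates and the 65 % pass-rate reported for 2019.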
Feel free to zoom/hover on any of the interactive plots above to explore exact values.
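The upper limits in the forecast table exceed 100 %, which a pass percentage cannot reach. One simple way to read the intervals is to clip them to the feasible range. This is a minimal sketch using the reported values; `clip_percent` is a hypothetical helper, not part of the original analysis, and clipping does not change the underlying model.

```python
# Forecast values as reported in the table: (point, lower, upper) in Pass-%.
forecast = {
    2025: (85.6, 53.8, 117.5),
    2026: (85.6, 42.1, 129.2),
    2027: (85.6, 32.9, 138.4),
}

def clip_percent(value, low=0.0, high=100.0):
    """Restrict a percentage to the feasible range [low, high]."""
    return max(low, min(high, value))

# Clip only the interval limits; the point forecast is left untouched.
clipped = {
    year: (point, clip_percent(lower), clip_percent(upper))
    for year, (point, lower, upper) in forecast.items()
}

for year, (point, lower, upper) in clipped.items():
    print(f"{year}: {point:.1f} % (feasible range {lower:.1f} to {upper:.1f})")
```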

Would you like to:
  • Forecast the number of candidates sat or pass counts?
  • Drill further into grade-level movements?
  • Combine this with data from another subject for comparison?

Please export the cleaned dataset or the forecast table to a CSV file so I can download it.

✅ CSV files created successfully!
  • cleaned_ol_english.csv – contains the tidy data for the years 2019-2024 (columns: Year, Sat, Pass, A, B, C, S, W, Pass-%).
  • forecast_pass_percent.csv – holds the ARIMA point-forecast and 95 % confidence limits for Pass-% in 2025-2027.
You can download them from the notebook’s Files sidebar (they’ll be in the current working directory) or with a right-click → Save link as… if your environment shows direct links.
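For reference, an export like the forecast file could be written as follows. This is a sketch using only the standard library `csv` module rather than whatever the notebook actually ran; the column names `Forecast`, `Lower95`, and `Upper95` are assumptions, since the real file's headers are not shown in the conversation.

```python
import csv
from pathlib import Path

# Forecast rows as reported in the conversation.
# Column names are assumptions; the real file's headers are not shown.
rows = [
    {"Year": 2025, "Forecast": 85.6, "Lower95": 53.8, "Upper95": 117.5},
    {"Year": 2026, "Forecast": 85.6, "Lower95": 42.1, "Upper95": 129.2},
    {"Year": 2027, "Forecast": 85.6, "Lower95": 32.9, "Upper95": 138.4},
]

def write_forecast_csv(path, rows):
    """Write the forecast rows to `path` and return the resolved location."""
    path = Path(path)
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0]))
        writer.writeheader()
        writer.writerows(rows)
    return path.resolve()

saved_to = write_forecast_csv("forecast_pass_percent.csv", rows)
print(f"Saved to {saved_to}")
```

Printing the absolute path makes it easier to confirm where the file landed before looking for it in the Files sidebar.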

Let me know if you need a different file format or any additional analysis!

Can we forecast the number of students receiving grade A next?

Forecast for Grade A Counts (2025 – 2027)

Year    Predicted A Grades    95 % Confidence Interval
2025    ≈ 424                 368 – 479
2026    ≈ 428                 322 – 535
2027    ≈ 431                 276 – 587
Key points
• The model projects a steady, slight rise in A-grade counts, adding about 3–4 students per year across 2025–2027.
• Intervals widen as we move further out, indicating increasing uncertainty—typical with only six historical observations.
• If the upper-bound materialises, the number of students achieving an A could surpass 500 by 2026–2027.
The interactive plot above shows historical values (2019-2024) with the blue forecast path and its shaded confidence band.
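The widening of the intervals and the size of the yearly rise can be read directly off the forecast table. A small sketch using the reported values (the point forecasts are rounded, so the increments are approximate):

```python
# Grade A forecast as reported in the table: (point, lower, upper).
grade_a = {
    2025: (424, 368, 479),
    2026: (428, 322, 535),
    2027: (431, 276, 587),
}

years = sorted(grade_a)

# Width of each 95 % interval (upper limit minus lower limit).
widths = {year: grade_a[year][2] - grade_a[year][1] for year in years}

# Change in the point forecast from one year to the next.
increments = {
    later: grade_a[later][0] - grade_a[earlier][0]
    for earlier, later in zip(years, years[1:])
}

print(widths)      # {2025: 111, 2026: 213, 2027: 311}
print(increments)  # {2026: 4, 2027: 3}
```

The interval roughly triples in width between 2025 and 2027, while the point forecast moves by only a few students.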
Would you like to:
  • Save this forecast to CSV for download?
  • Repeat the exercise for grades B/C/S/W?
  • Compare forecasts across subjects?

graphical analysis

Why did the error occur?

The plotting code expects a dataframe named clean_df that contains the cleaned O/L English results.
However, that dataframe no longer exists in the current kernel session and the fall-back attempt to reload it from ~/cleaned_ol_english.csv failed because that file is not present in the working directory.
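The fallback described above can be sketched roughly as follows. This is an illustration only: `load_clean_rows` is a hypothetical helper, it uses the standard library `csv` module instead of a dataframe, and the actual plotting cell may be structured differently.

```python
import csv
from pathlib import Path

def load_clean_rows(csv_path, in_memory=None):
    """Return the cleaned rows, preferring data already in memory."""
    if in_memory is not None:
        return in_memory

    # Expand "~" so the path points at the user's home directory.
    path = Path(csv_path).expanduser()
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} not found: re-upload the raw file or re-run the cleaning cell."
        )
    with path.open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))

try:
    clean_rows = load_clean_rows("~/cleaned_ol_english.csv")
except FileNotFoundError as error:
    # Report the problem instead of failing inside the plotting code.
    clean_rows = None
    print(error)
```

Raising a clear error at load time keeps the failure close to its cause, before any plotting code runs.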

How to resolve it

  1. If you still have the original CSV (or any raw file) – please upload it again, or tell me the exact filename so I can load it.
  2. If you previously ran the data-cleaning cell – simply re-run that cell first so clean_df is recreated.
  3. Alternatively, rerun the cell that saved the file so the CSV appears at the expected location, then execute the plotting code again.
Once clean_df is in memory, displaying it should show the table, and the graphical analysis cell will then run without issues.
Let me know which option you prefer and I’ll take care of the rest!
I still can’t find the file cleaned_ol_english.csv in the notebook’s working directory, so the dataframe can’t be rebuilt automatically.
To create the requested graphics we need the cleaned dataset back in memory.
👉 Could you please either:
  1. Re-upload the original/raw file you shared earlier, or
  2. Re-run the data-cleaning cell that produced clean_df, or
  3. Tell me the exact path / filename where the CSV was saved so I can load it.
Once clean_df is loaded again I’ll immediately generate the stacked bars, multi-line chart, and heat-map of the grade distribution for you.