EdwardLee798/7108project

COMP7108 Project: Graph Databases for Financial Intelligence

This repository contains our COMP7108 project implementation using the PaySim transaction dataset and Neo4j/Cypher for fraud-oriented graph analytics.

1) Project Scope

The project covers the assignment pipeline end-to-end:

  • Property-graph modeling and ingestion of PaySim data
  • Fraud pattern discovery with graph traversal queries
  • Fraud-subgraph visualization
  • Graph algorithm analysis (GDS)
  • Temporal fraud behavior analysis

2) Dataset

Note: the raw CSV is not included in this repository. Download it separately and place it in a directory that Neo4j can import from.

3) Repository Structure

  • Project/taskA: ingestion, schema reset, constraints/indexes, validation
  • Project/taskB: pattern queries (fan-in, fan-out, mule, cycle attempts, shortest path)
  • Project/taskC: fraud subgraph visualization query + exported figures
  • Project/taskD: graph algorithm queries (PageRank, community, betweenness)
  • Project/taskE: temporal-analysis outputs and Python post-processing scripts
  • scripts: assignment brief and project guidance documents

4) Environment Requirements

  • Neo4j (for Cypher execution)
  • Neo4j Graph Data Science plugin (required for Task D queries)
  • Python 3.9+ (for Task E scripts)
  • Python packages:
    • pandas
    • matplotlib

5) How to Run

Step A. Prepare data for import

  1. Download PaySim CSV from Kaggle.
  2. Rename the file and place it where Neo4j can read it, so that the import path in Project/taskA/taskA03_import.cypher resolves (file:///data.csv by default).
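Step A can be scripted. The sketch below copies a downloaded CSV into an import directory under the default name `data.csv`. The function name `stage_csv` is hypothetical, and the import directory is an assumption that depends on how Neo4j is installed, so it is passed in rather than guessed.

```python
import shutil
from pathlib import Path


def stage_csv(downloaded_csv, import_dir, target_name="data.csv"):
    """Copy the downloaded PaySim CSV into Neo4j's import directory.

    `import_dir` must be the directory your Neo4j instance imports from;
    its location varies by installation and is not assumed here.
    """
    src = Path(downloaded_csv)
    if not src.is_file():
        raise FileNotFoundError(f"CSV not found: {src}")
    dest_dir = Path(import_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / target_name
    shutil.copyfile(src, dest)  # the original download is left in place
    return dest
```

If the path in taskA03_import.cypher has been changed from the default, pass the matching file name as `target_name`.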

Step B. Run Task A (in Neo4j Browser, in order)

  1. Project/taskA/taskA01_reset_and_schema.cypher
  2. Project/taskA/taskA02_constraints_and_indexes.cypher
  3. Project/taskA/taskA03_import.cypher
  4. Project/taskA/taskA04_validation_queries.cypher
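The four scripts above are meant to be run in Neo4j Browser. As an alternative for repeated runs, the sketch below builds one `cypher-shell -f <file>` command per script in the same order. It assumes `cypher-shell` is on your PATH and that credentials are supplied separately (for example through its prompts or environment variables); the helper name `task_a_commands` is hypothetical.

```python
from pathlib import Path

# Execution order from Step B.
TASK_A_SCRIPTS = [
    "taskA01_reset_and_schema.cypher",
    "taskA02_constraints_and_indexes.cypher",
    "taskA03_import.cypher",
    "taskA04_validation_queries.cypher",
]


def task_a_commands(repo_root="."):
    """Return one cypher-shell argument list per Task A script, in order."""
    task_dir = Path(repo_root) / "Project" / "taskA"
    return [["cypher-shell", "-f", str(task_dir / name)] for name in TASK_A_SCRIPTS]


if __name__ == "__main__":
    for cmd in task_a_commands():
        print(" ".join(cmd))
```

Passing each command to `subprocess.run(cmd, check=True)` would stop the sequence at the first failing script, which matters here because the import depends on the schema and constraints already being in place.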

Step C. Run Task B/C/D queries

Step D. Run Task E analysis

  1. (E01) Run Project/taskE/taskE01_transaction_spike.cypher and export the results as taskE01_outgoing.csv and taskE01_incoming.csv.
  2. (E01) Edit the file paths in Project/taskE/taskE01_transaction_spike.py to point to your exported CSVs, then run the script to generate the spike results.
  3. (E02) Run Project/taskE/taskE02_balance_anomaly.cypher and export anomaly records.
  4. (E03) Run Project/taskE/taskE03_fraud_timeline.cypher and export timeline CSVs.
  5. (E03) For fraud timeline plotting, use Project/taskE/taskE03_fraud_timeline_plot.py with Project/taskE/output/taskE03_fraud_timeline.csv.
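To illustrate the kind of post-processing the E01 step performs, here is a dependency-free sketch of one possible spike definition: a time step is flagged when an account's transaction count exceeds its own mean by `k` standard deviations. This is not the logic of taskE01_transaction_spike.py, which may differ. The column names `account`, `step`, and `tx_count` are assumptions about the exported CSV, and the function name `find_spikes` is hypothetical.

```python
import statistics
from collections import defaultdict


def find_spikes(rows, k=2.0):
    """Flag (account, step, count) where count > mean + k * std for that account.

    `rows` is an iterable of dicts, e.g. from csv.DictReader, with the
    assumed columns 'account', 'step' and 'tx_count'.
    """
    per_account = defaultdict(list)
    for row in rows:
        per_account[row["account"]].append((int(row["step"]), int(row["tx_count"])))

    spikes = []
    for account, series in per_account.items():
        counts = [count for _, count in series]
        if len(counts) < 2:
            continue  # a single observation has no baseline
        mean = statistics.mean(counts)
        std = statistics.pstdev(counts)
        if std == 0:
            continue  # perfectly flat activity cannot contain a spike
        threshold = mean + k * std
        spikes.extend(
            (account, step, count)
            for step, count in sorted(series)
            if count > threshold
        )
    return spikes
```

The function could be applied to each export in turn, for example `find_spikes(csv.DictReader(open("taskE01_outgoing.csv", newline="")))`, once the column names are adjusted to match the real files.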

Important: the Python scripts currently hard-code Windows absolute paths; update them to match your local files before running.
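One way to avoid hard-coded absolute paths is to resolve files relative to the script's own location. The sketch below assumes the CSVs live in an `output` folder next to the script, as with Project/taskE/output/taskE03_fraud_timeline.csv; the helper name `output_path` is hypothetical and not part of the repository.

```python
from pathlib import Path


def output_path(script_file, name):
    """Return <directory of script_file>/output/<name> as a Path."""
    return Path(script_file).parent / "output" / name


if __name__ == "__main__":
    # Inside a Task E script, __file__ would be passed instead of a literal.
    print(output_path("Project/taskE/taskE03_fraud_timeline_plot.py",
                      "taskE03_fraud_timeline.csv"))
```

Because `pathlib` builds the path with the platform's separator, the same script then runs unchanged on Windows, macOS, and Linux.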

6) Mapping to Assignment Requirements (Task A–E)

Task A: Data ingestion and optimization

Task B: Pattern discovery

Task C: Visualization

Task D: Graph algorithm application

Task E: Temporal analysis

7) Deliverables in This Repository

Aligned with assignment deliverables:
