CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Development Commands
Local Development
- Run development server:
bash run_server.sh
or
bundle exec jekyll liveserve
- Opens local server at http://127.0.0.1:4000
- Includes live reload for automatic refresh when files change
Dependency Management
- Install dependencies:
bundle install
- Update dependencies:
bundle update
Google Scholar Citation Updates
- Manual trigger: Use GitHub Actions workflow “Update Google Scholar Citations”
- Automatic: Runs daily at midnight UTC via cron job
- Local testing:
cd google_scholar_crawler && python main.py
Architecture Overview
This is an academic personal homepage built on Jekyll with automated Google Scholar citation tracking.
Core Components
Jekyll Site Structure:
- _config.yml: Main Jekyll configuration with site metadata, author profile, and CDN settings
- _pages/about.md: Main homepage content with publications and bio
- _includes/: Reusable HTML components (analytics, author profile, etc.)
- _layouts/default.html: Main page template
- _sass/: SCSS stylesheets for custom styling
Google Scholar Integration:
- google_scholar_crawler/main.py: Python script that fetches citation data using the scholarly library
- google-scholar-stats/gs_data_shieldsio.json: Citation data in shields.io format for badge display
- _includes/fetch_google_scholar_stats.html: JavaScript that dynamically loads citation counts into the page
- .github/workflows/update-citations.yml: GitHub Action that runs the crawler daily and commits updates
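As a rough illustration of the shields.io format mentioned above, the sketch below builds an endpoint-badge payload from a citation count. The function name `to_shields_payload` and the `"citations"` label are assumptions for illustration; the actual crawler in google_scholar_crawler/main.py may structure this differently.

```python
import json


def to_shields_payload(citedby: int, label: str = "citations") -> dict:
    """Build a shields.io endpoint-badge payload (hypothetical helper)."""
    return {
        "schemaVersion": 1,        # required by the shields.io endpoint schema
        "label": label,            # left-hand badge text (assumed label)
        "message": str(citedby),   # right-hand badge text, sent as a string
    }


if __name__ == "__main__":
    # Example count only; a real run would use the value fetched by the crawler.
    print(json.dumps(to_shields_payload(1234)))
```

A file with this shape can be referenced from a shields.io endpoint badge, which is presumably how the total citation count is rendered.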
Key Architectural Patterns
Citation Display System:
- Uses <span class='show_paper_citations' data='PAPER_ID'></span> tags to mark where individual paper citations should appear
- JavaScript fetches full citation data and populates these elements dynamically
- Total citations displayed via shields.io badge that pulls from the JSON file
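The site performs this substitution in the browser with JavaScript. The Python sketch below only mirrors the lookup logic, to make the data flow concrete. The data shape (a mapping from paper ID to a record with a `num_citations` field), the output text, and the helper name `fill_citations` are assumptions, not the repository's confirmed format.

```python
import re

# Matches the empty placeholder span and captures the paper id.
SPAN_RE = re.compile(r"<span class='show_paper_citations' data='([^']*)'></span>")


def fill_citations(html: str, publications: dict) -> str:
    """Insert citation counts into placeholder spans (hypothetical helper)."""

    def repl(match: re.Match) -> str:
        paper_id = match.group(1)
        entry = publications.get(paper_id)
        if entry is None:
            # Unknown ids are left untouched rather than raising an error.
            return match.group(0)
        return (
            f"<span class='show_paper_citations' data='{paper_id}'>"
            f"Citations: {entry['num_citations']}</span>"
        )

    return SPAN_RE.sub(repl, html)


if __name__ == "__main__":
    page = "<span class='show_paper_citations' data='PAPER_ID'></span>"
    print(fill_citations(page, {"PAPER_ID": {"num_citations": 7}}))
```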
CDN Strategy:
- google_scholar_stats_use_cdn: true in config enables CDN delivery via jsdelivr.net
- Falls back to raw GitHub URLs when CDN is disabled
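The CDN switch can be pictured as a choice between two base URLs. The sketch below uses the public jsDelivr and raw GitHub URL patterns; the function name `stats_base_url` and the repository name in the example are placeholders, and the site's own JavaScript may assemble the URL differently.

```python
def stats_base_url(repo: str, branch: str, use_cdn: bool) -> str:
    """Return the base URL for citation data files (hypothetical helper).

    repo is "owner/name"; branch is the branch holding the data files.
    """
    if use_cdn:
        # jsDelivr serves GitHub content as /gh/<owner>/<repo>@<ref>/<path>
        return f"https://cdn.jsdelivr.net/gh/{repo}@{branch}/"
    # Fallback: raw file access served directly by GitHub
    return f"https://raw.githubusercontent.com/{repo}/{branch}/"


if __name__ == "__main__":
    # "example-user/example-user.github.io" is a placeholder repository.
    for flag in (True, False):
        print(stats_base_url("example-user/example-user.github.io",
                             "google-scholar-stats", flag))
```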
Error Handling:
- Citation crawler includes timeout protection (8 minutes) and retry logic with exponential backoff
- Graceful degradation when Google Scholar blocks requests - preserves existing citation data
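A minimal sketch of the retry-and-preserve behaviour described above, assuming the fetch step is a plain callable that raises on failure. The names `fetch_with_backoff` and `update_citations`, the retry count, and the base delay are illustrative assumptions. The 8-minute timeout is not modelled here.

```python
import time
from typing import Callable, Optional


def fetch_with_backoff(
    fetch: Callable[[], dict],
    retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Call fetch() up to `retries` times, doubling the wait after each failure."""
    for attempt in range(retries):
        try:
            return fetch()
        except Exception:
            # Wait base_delay, 2*base_delay, 4*base_delay, ... between attempts,
            # but do not sleep after the final failed attempt.
            if attempt < retries - 1:
                sleep(base_delay * 2 ** attempt)
    return None


def update_citations(existing: dict, fetch: Callable[[], dict], **kwargs) -> dict:
    """Return fresh data if the fetch succeeds, otherwise keep the existing data."""
    fresh = fetch_with_backoff(fetch, **kwargs)
    return fresh if fresh is not None else existing
```

Passing `sleep` as a parameter keeps the sketch testable without real delays. If every attempt fails, `update_citations` returns the existing data unchanged, which matches the graceful-degradation behaviour described above.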
File Organization
- Content: _pages/about.md contains all homepage content
- Styling: Custom CSS in assets/css/ and SCSS in _sass/
- Data: Citation data stored in the google-scholar-stats/ branch
- Scripts: Python crawler in google_scholar_crawler/
- Assets: Images, fonts, and JavaScript in assets/
Important Configuration
Google Scholar ID: Set as the GOOGLE_SCHOLAR_ID environment variable (currently: EqemKYsAAAAJ)
Jekyll Exclusions: The build excludes documentation, crawler files, and development assets to keep the site clean.
Citation Badge: Displays the Google Scholar citation count, refreshed by the automated daily update rather than in real time.