Compressed KV-Cache Layer for LLM Inference (Ongoing)
A compressed KV-cache layer for long-context LLM inference, integrating nano-vLLM with an LLM.265-inspired codec backend (a minimal sketch of the compression path follows the tags below).
- Inference Engine
- KV-Cache Compression
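For illustration only, here is a minimal sketch of what the compression path could look like, assuming per-channel int8 quantization over fixed-size token blocks; the block size, tensor layout, and quantization scheme are assumptions for the sketch, not the project's actual LLM.265-style codec:

```python
# Illustrative sketch only: per-channel int8 quantization of a KV-cache block.
# Shapes, block size, and the quantization scheme are assumptions, not the
# project's actual LLM.265-style codec.
import torch

def compress_kv_block(kv: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
    """Quantize a [block_size, num_heads, head_dim] KV block to int8.

    Returns the int8 payload plus the per-channel scales needed to decompress.
    """
    # One scale per (head, channel) pair, computed over the token axis.
    scale = kv.abs().amax(dim=0).clamp(min=1e-8) / 127.0
    q = torch.round(kv / scale).clamp(-127, 127).to(torch.int8)
    return q, scale

def decompress_kv_block(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    """Dequantize back to the working dtype before attention runs."""
    return q.to(scale.dtype) * scale

# Usage: compress a 256-token block for an 8-head, 64-dim cache.
kv = torch.randn(256, 8, 64)
q, scale = compress_kv_block(kv)
recovered = decompress_kv_block(q, scale)
print((kv - recovered).abs().max())  # bounded quantization error
```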
A collaborative Markdown editor featuring GitLab OAuth sign-in, document sharing and co-editing, and live preview with syntax highlighting.
An end-to-end fine-tuning and evaluation pipeline for gemma-3-270m, measuring its capabilities in factual QA, reasoning, and instruction following.
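As a rough sketch of what the fine-tuning leg of such a pipeline could look like with Hugging Face transformers: the checkpoint name google/gemma-3-270m is the public Hugging Face ID, while the dataset file train.jsonl, its prompt/response schema, and all hyperparameters are illustrative assumptions:

```python
# Illustrative sketch only: supervised fine-tuning of gemma-3-270m with
# Hugging Face transformers. Dataset path, prompt format, and hyperparameters
# are assumptions, not the project's actual pipeline.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

model_name = "google/gemma-3-270m"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Assumed JSONL schema: {"prompt": ..., "response": ...} per line.
data = load_dataset("json", data_files="train.jsonl", split="train")

def to_features(example):
    text = example["prompt"] + "\n" + example["response"] + tokenizer.eos_token
    return tokenizer(text, truncation=True, max_length=512)

tokenized = data.map(to_features, remove_columns=data.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", per_device_train_batch_size=8,
                           num_train_epochs=1, learning_rate=2e-5),
    train_dataset=tokenized,
    # Causal-LM collator pads batches and copies input_ids into labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```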
An iOS app for medical trainees and faculty at Duke Hospital, streamlining information entry, communication, evaluation, and scheduling.
A robust C compiler incorporating lexical analysis, syntax analysis, semantic analysis, IR generation, IR optimization, and code generation.
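To give a concrete flavor of the first of those stages, here is a toy lexer, written in Python for brevity rather than in the compiler's implementation language; the token set and keyword list are deliberately tiny assumptions, not the compiler's actual grammar:

```python
# Illustrative sketch only: the flavor of the lexical-analysis stage.
# The token classes and keywords below are a small assumed subset of C.
import re

TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("ID",     r"[A-Za-z_]\w*"),
    ("OP",     r"[+\-*/=;(){}]"),
    ("SKIP",   r"[ \t\n]+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

KEYWORDS = {"int", "return", "if", "else", "while"}

def tokenize(source: str):
    """Yield (kind, lexeme) pairs; keywords are split out from identifiers."""
    for m in MASTER.finditer(source):
        kind, lexeme = m.lastgroup, m.group()
        if kind == "SKIP":
            continue
        if kind == "ID" and lexeme in KEYWORDS:
            kind = "KEYWORD"
        yield kind, lexeme

print(list(tokenize("int x = 42;")))
# [('KEYWORD', 'int'), ('ID', 'x'), ('OP', '='), ('NUMBER', '42'), ('OP', ';')]
```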
A cross-platform, open-source forum app featuring post creation, comments, favorites, and private messaging.
A predictive analysis comparing MLP, Random Forest, and XGBoost models for predicting NBA player scores from 2023-season player statistics.
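A minimal sketch of such a model comparison, assuming a local CSV of 2023 player statistics; the file name, the column names (minutes, fga, fg_pct, usage_rate, points), and the hyperparameters are hypothetical:

```python
# Illustrative sketch only: comparing the three model families on a
# points-per-game regression task. The CSV path, feature columns, and
# hyperparameters are assumptions, not the project's actual setup.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from xgboost import XGBRegressor

df = pd.read_csv("nba_2023_player_stats.csv")       # assumed file
X = df[["minutes", "fga", "fg_pct", "usage_rate"]]  # assumed feature columns
y = df["points"]

models = {
    "mlp": make_pipeline(StandardScaler(),
                         MLPRegressor(hidden_layer_sizes=(64, 32),
                                      max_iter=2000, random_state=0)),
    "random_forest": RandomForestRegressor(n_estimators=300, random_state=0),
    "xgboost": XGBRegressor(n_estimators=300, learning_rate=0.05),
}

for name, model in models.items():
    # 5-fold cross-validated R^2 keeps the comparison apples-to-apples.
    scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    print(f"{name}: R^2 = {scores.mean():.3f} +/- {scores.std():.3f}")
```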
An analysis program for large-scale website logs, extracting key website metrics and optimizing data processing efficiency.
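For illustration, a streaming aggregation in this style might look like the following; the file name access.log, the combined log format, and the two metrics shown (per-path hits, unique client IPs) are assumptions standing in for the project's actual ones:

```python
# Illustrative sketch only: streaming aggregation over a large access log.
# The log path and combined-log-format assumption are hypothetical.
import re
from collections import Counter

# Assumed Apache/Nginx combined format: client IP first, request in quotes.
LINE = re.compile(r'(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?:GET|POST) (?P<path>\S+)')

hits = Counter()
visitors = set()

# Stream line by line so memory stays flat regardless of log size.
with open("access.log", encoding="utf-8", errors="replace") as f:
    for line in f:
        m = LINE.match(line)
        if m:
            hits[m.group("path")] += 1
            visitors.add(m.group("ip"))

print("unique visitors:", len(visitors))
print("top pages:", hits.most_common(5))
```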
A RISC-V-based computer system on an FPGA board, incorporating instruction processing, keyboard interfacing, and VGA display modules.
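As a software-side illustration of the instruction-processing module, here is a sketch that splits a 32-bit RV32I word into its standard fields; the hardware itself would do this in HDL, so the Python below is purely explanatory and is not part of the FPGA design:

```python
# Illustrative sketch only: decoding the fixed fields of a 32-bit RV32I
# instruction word, in Python for clarity (the FPGA design itself is HDL).
def decode_rv32i(word: int) -> dict:
    """Split an instruction word into the fields the datapath routes on."""
    return {
        "opcode": word & 0x7F,          # bits [6:0] select the format
        "rd":     (word >> 7)  & 0x1F,  # destination register
        "funct3": (word >> 12) & 0x07,  # minor opcode
        "rs1":    (word >> 15) & 0x1F,  # first source register
        "rs2":    (word >> 20) & 0x1F,  # second source register
        "funct7": (word >> 25) & 0x7F,  # distinguishes e.g. ADD from SUB
    }

# 0x003100B3 encodes ADD x1, x2, x3 (R-type).
print(decode_rv32i(0x003100B3))
```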