[{"content":"My 4.5-year tenure at 6sense was a period of massive technical evolution. I joined shortly after the Covid lockdowns and went from building single-page applications in \u0026ldquo;hackathon mode\u0026rdquo; to architecting AI systems that process millions of signals daily.\nHere is a deep dive into the systems I built, the architectural decisions my team made, and the challenges we navigated along the way.\n1. The Foundation: Sales Dashboards (2020 - 2025) My first major project was building the \u0026ldquo;Sales Dashboards\u0026rdquo; app from scratch. This was the starting point for the sales team to get insights and recommendations for their accounts.\nThe Build: I was the sole frontend developer responsible for the initial architecture, boilerplate code, CI/CD pipelines, and the foundational reusable component library. The Stack: We went with JavaScript + React + GraphQL + Apollo on the frontend, backed by Django. The Grind: We spent 3 months in pure hackathon mode, pushing patches until the very last minute before our November 2020 launch. I maintained this frontend for its entire lifecycle until it was officially decommissioned in March 2025 (RIP 🪦). 2. The Slintel Merger: Architecting the New Sales Intelligence When 6sense acquired Slintel, we faced a massive integration challenge. We needed to merge legacy 6sense Sales Intelligence with Slintel\u0026rsquo;s technology.\nInitially, we considered the easy route—just embedding one app\u0026rsquo;s frontend as an iframe within the other. Instead, as part of a 3-member architect team, we decided to execute a total UI overhaul and introduce a completely new design language.\nThe Migration: We migrated from JavaScript to TypeScript, utilized the Craco framework, and created new boilerplates to connect with the backend. The Execution: It took 5 months, 15 engineers, and a lot of late nights, but the unified Sales Intelligence app was a massive success, eventually contributing over $20M+ in revenue. 3. 
Scaling Big Data \u0026amp; The Outlook Rendering Nightmare As the platform grew, my responsibilities expanded into the backend, specifically focusing on data pipelines and automated alerting through SI Alerts.\nWe sent over 90,000 custom intent alerts daily, notifying users of critical signals for their accounts. This project presented two entirely different types of engineering challenges:\nThe Frontend (Email Clients): Writing the frontend for this was peculiar because these alerts don\u0026rsquo;t load in browsers; they load in native email clients. Microsoft Outlook uses the MS Word engine to render HTML. Building responsive templates using \u0026ldquo;ancient\u0026rdquo; table-based CSS to satisfy Outlook was an absolute trial by fire. The Backend (PySpark): The backend was my introduction to big data. We used Python, PySpark, and Jinja2 to generate the templates. Later, as part of the 6-member SI Tech Pod, I took on ownership of keeping these massive pipelines running—handling PagerDuty, optimizing slow queries, managing Hive and Hadoop clusters, and successfully optimizing a critical DAG that reduced our daily run time by over 45 minutes.\n4. The LLM Era: AI Email Writer \u0026amp; Account Summaries In my final years at 6sense, my focus shifted entirely to Generative AI product engineering. I was tasked with building systems that didn\u0026rsquo;t just display data, but acted on it.\nAI Account Summary: I built an engine that generated 20,000+ summaries daily, taking raw activity recaps, buying stage predictions, and intent insights, and turning them into actionable persona mappings. To make this performant, I implemented heavy response caching and async processing, which drastically reduced redundant LLM calls and latency. AI Email Writer: This was the culmination of our intent data. I built an LLM system that processed 1,500+ emails daily, weaving together CRM history, technographics, and behavioral signals. 
Because of the risks associated with AI-generated outbound emails, I engineered multi-layer guardrails and feedback loops to ensure output safety, alongside a credit-based usage system with org-level quotas. The impact was undeniable: a 42% reply rate and a 29% increase in booked meetings. ","permalink":"https://w3shubh.com/posts/pandora/","summary":"A 4.5-year journey scaling frontend architectures, merging acquisitions, building big data pipelines, and shipping Generative AI infrastructure.","title":"6sense: From Hackathons to AI Pipelines"},{"content":"When 6sense decided to turn its massive intent data into actionable outbound emails, I was tasked with building the AI Email Writer from the ground up. This wasn\u0026rsquo;t just another GPT wrapper — it was a production system that needed to be accurate, safe, and scalable.\nThe Problem Sales teams spend hours crafting personalized outbound emails. Even with intent data telling them who to contact and why, the actual email composition remained a manual bottleneck. 
We needed a system that could:\nIngest multiple data signals (CRM history, technographics, behavioral data, intent signals) Generate personalized, context-aware emails Guarantee safety — no hallucinations, no inappropriate content in customer-facing communications Scale to thousands of emails daily across hundreds of organizations Architecture The system was built as a Django-based pipeline with OpenAI\u0026rsquo;s API at its core:\nData Aggregation Layer: Pull together intent signals, technographic data, CRM interaction history, and behavioral signals for each target account Prompt Engineering: Dynamic prompt construction that weaves together all data signals into a coherent context window LLM Generation: OpenAI API calls with carefully tuned parameters for tone, length, and style Multi-Layer Guardrails: Three concurrent validation checks run on every generated email Feedback Loops: Failed validations trigger automatic retries with adjusted prompts The Guardrails System This was the most critical engineering challenge. AI-generated outbound emails carry significant brand risk. I engineered a three-layer validation system:\nSafety Check: Screens for inappropriate content, competitor mentions, and policy violations Hallucination Detection: Cross-references generated claims against actual account data — if the email mentions a metric or fact, it must exist in the source data Framework Check: Validates email structure, call-to-action presence, and adherence to the selected email framework (AIDA, PAS, BAB, etc.) All three checks run concurrently for performance. 
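As a rough sketch of how three checks might run concurrently, assuming asyncio (the validators, policy terms, and heuristics below are invented stand-ins, not the production checks):

```python
import asyncio

# Hypothetical validators: each returns (passed, reason) so that
# failure reasons can later be fed back into the prompt.
async def safety_check(email: str) -> tuple[bool, str]:
    banned = ['guarantee', 'lawsuit']  # illustrative policy terms
    hit = next((w for w in banned if w in email.lower()), None)
    return (hit is None, f'policy term: {hit}' if hit else '')

async def hallucination_check(email: str, facts: set[str]) -> tuple[bool, str]:
    # Naive stand-in: every percentage claim must exist in the source data.
    claims = {tok.strip('.,') for tok in email.split() if tok.endswith('%')}
    missing = claims - facts
    return (not missing, f'unverified claims: {missing}' if missing else '')

async def framework_check(email: str) -> tuple[bool, str]:
    has_cta = '?' in email  # crude call-to-action test
    return (has_cta, '' if has_cta else 'no call-to-action')

async def validate(email: str, facts: set[str]) -> list[str]:
    # Run all three guardrails concurrently; collect failure reasons.
    results = await asyncio.gather(
        safety_check(email),
        hallucination_check(email, facts),
        framework_check(email),
    )
    return [reason for passed, reason in results if not passed]

failures = asyncio.run(validate('We grew 42% last year. Shall we talk?', {'42%'}))
# an empty failures list means the email passed every guardrail
```

The production validators were LLM- and data-backed rather than string heuristics; the point here is the shape: concurrent checks returning failure reasons that can drive a retry.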
If any check fails, the system triggers a feedback-loop retry with the failure reason injected into the prompt.\nCredit-Based Usage System To manage costs and prevent abuse, I built an org-level quota system:\nOrganization Quotas: Each org gets allocated credits based on their subscription tier Regeneration Limits: Users can regenerate emails, but each regeneration consumes credits Real-Time Tracking: Live dashboards showing credit usage, remaining balance, and consumption trends Graceful Degradation: When credits run low, the system alerts admins before hitting hard limits Impact The results exceeded expectations:\nMetric Result Daily Volume 1,500+ emails generated Reply Rate 42% (vs. ~15% industry average) Booked Meetings 29% increase from outbound campaigns Guardrail Pass Rate 94% on first attempt Tech Stack Python · Django · OpenAI API · Celery · Redis · PostgreSQL\nThis project represents the intersection of AI engineering and product thinking — building systems that are not just technically impressive but deliver measurable business outcomes.\n","permalink":"https://w3shubh.com/posts/ai-email-writer/","summary":"Built an LLM email generation system processing 1,500+ emails daily with multi-layer guardrails, achieving 42% reply rates.","title":"AI Email Writer — LLM-Powered Outbound at Scale"},{"content":"The AI Account Summary engine was one of the highest-impact projects I worked on at 6sense. It transformed raw data lakes of account activity into concise, actionable intelligence that sales reps could use instantly.\nThe Problem Sales reps at 6sense\u0026rsquo;s enterprise customers were drowning in data. Each account had dozens of signals — web visits, content downloads, keyword research trends, technographic changes, and more. 
Reps needed to sift through all of this before every call or meeting.\nWe needed a system that could:\nDistill hundreds of data points into a single, readable summary Predict buying stages and recommend next actions Map personas within target accounts Scale to 20,000+ summaries per day without breaking the bank on LLM costs Architecture Data Pipeline The system ingested multiple data streams:\nActivity Recaps: Recent web visits, content engagement, and ad interactions Intent Signals: Keyword-level research activity mapped to buying intent Buying Stage Predictions: ML-model outputs indicating where the account sits in the purchase journey Technographic Data: Known tech stack, recent changes, and competitive displacement signals Summary Generation Each summary was structured into actionable sections:\nExecutive Overview — What happened with this account recently? Intent Insights — What topics are they researching? Buying Stage — Where are they in the journey? Persona Mapping — Who are the key decision-makers engaging? Recommended Next Actions — What should the rep do next? Performance Optimization Generating 20K+ summaries daily with LLM calls is expensive. Two key optimizations made this feasible:\nResponse Caching: Identical or near-identical data inputs produce cached summaries. We implemented a smart hashing mechanism that creates cache keys from the data fingerprint, so accounts with unchanged signals serve cached responses Async Processing: Summary generation runs asynchronously via task queues. 
Reps see a loading state briefly, then the summary renders — but the LLM isn\u0026rsquo;t blocking any request threads These optimizations reduced redundant LLM calls by ~60% and brought average latency down significantly.\nFrontend The summary UI was built in React + TypeScript, designed for scannability:\nCollapsible sections for each summary component Visual indicators for intent signal strength Inline actions (schedule follow-up, add to cadence, share with team) Responsive design for both desktop dashboards and mobile views Impact Metric Result Daily Volume 20,000+ summaries Pipeline Impact 11% increase in Stage 0 pipeline LLM Cost Reduction ~60% via caching Data Sources 5+ integrated signal types Tech Stack Python · Django · OpenAI API · React · TypeScript · Celery · Redis · PostgreSQL\nThis project showed me that the real challenge in AI engineering isn\u0026rsquo;t the model — it\u0026rsquo;s the infrastructure around it. Caching, async processing, and data pipeline design determine whether an AI feature is a demo or a product.\n","permalink":"https://w3shubh.com/posts/ai-account-summary/","summary":"Built an AI engine generating 20,000+ account summaries daily with buying stage predictions, intent insights, and next-action recommendations.","title":"AI Account Summary — 20K+ Daily AI Summaries"},{"content":"OrangeSomething.com was a Direct-to-Consumer (D2C) platform for affordable beauty and personal care products, incubated by Blinkit (formerly Grofers). As the Founding Engineer, I owned the entire technical stack.\nThe Shopify → React Migration When I joined, the platform ran on Shopify with an average page load time of 12 seconds. 
For a D2C brand competing for attention, this was unacceptable.\nI led the migration from Shopify to a ReactJS Single-Page Application (SPA):\nRedesigned the entire frontend architecture Implemented code splitting and lazy loading Optimized asset delivery and image compression Reduced average load time from 12 seconds to under 3 seconds — a 4x improvement The result was a dramatically better user experience, lower bounce rates, and improved conversion metrics.\nAndroid App Development Beyond the web platform, I designed and developed a native Android app using Kotlin:\nBuilt from scratch with modern Android architecture (MVVM) Integrated with the same backend APIs as the web platform Improved customer retention and increased average order value (AOV) Higher conversion rates compared to mobile web Mentorship I also mentored a junior developer, helping them ramp up on both the React codebase and general software development practices.\nTechnologies Used ReactJS · JavaScript · Kotlin · Liquid (Shopify) · HTML · CSS\nOrangeSomething shut down its operations during the Covid lockdowns, but the experience of building a product from zero as a founding engineer was invaluable.\n","permalink":"https://w3shubh.com/posts/orangesomething/","summary":"Led the full tech stack for a D2C beauty platform incubated at Blinkit — migrated from Shopify to React (12s → 3s load) and built a native Android app.","title":"OrangeSomething — Founding Engineer at a Blinkit-Incubated D2C Brand"},{"content":"As 6sense grew, the volume of intent data, web activities, and technographic signals we processed exploded. The system that built and delivered these insights—specifically the daily SI Alerts pipeline—became my introduction to big data engineering.\nThe Challenge Every day, we generated over 90,000 custom alert emails for customers based on millions of underlying data points. 
This pipeline was critical; many users literally started their work week by reading these insights.\nWhen I joined the 6-member SI Tech Pod, our data infrastructure was groaning under the scale:\nLong-running pipelines frequently missed SLAs Slow, inefficient queries in Hive Legacy databases driving high operational costs Complex PySpark DAGs that were difficult to debug Pipeline Optimization One of my major wins was diving into a critical data pipeline DAG that had become a bottleneck for our daily runs.\nQuery Analysis: Identified inefficient joins and missing partitions in our Hive queries PySpark Tuning: Optimized the Spark execution plan by managing data skew and tuning shuffle partitions Caching: Strategically persisted intermediate DataFrames that were reused across multiple downstream jobs The Impact: These optimizations reduced the daily run time of our core pipeline by over 45 minutes, ensuring we consistently hit our delivery SLAs.\nDecommissioning Legacy Infrastructure Technical debt is an invisible cost until it isn\u0026rsquo;t. As we migrated features to the newer, unified Sales Intelligence platform, we were left with the legacy \u0026ldquo;Sales Dashboard\u0026rdquo; application and its underlying database.\nI led the effort to safely decommission this infrastructure. This wasn\u0026rsquo;t just pulling a plug; it required:\nMigrating any remaining active workloads Ensuring downstream consumers were repointed Safely backing up archived data The Impact: Retiring this legacy database resulted in over $130,000 in annualized infrastructure savings.\nTech Stack PySpark · Hadoop · Hive · Python · Airflow (DAGs) · AWS\nMoving from frontend engineering to big data pipelines was a trial by fire, but it gave me a deep appreciation for optimization. 
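One common way to manage the kind of data skew described above is key salting, spreading a hot join key across several sub-keys; a toy sketch in plain Python (invented names and counts; the production fix operated on Spark DataFrames and shuffle settings):

```python
from collections import Counter

def salt_key(key: str, row_id: int, buckets: int = 8) -> str:
    # Append a deterministic salt so one hot join key fans out across
    # several shuffle partitions instead of landing on a single task.
    return f'{key}#{row_id % buckets}'

# Skewed toy dataset: one account dominates the join key distribution.
rows = [('acme', i) for i in range(800)] + [('globex', i) for i in range(50)]

before = Counter(key for key, _ in rows)
after = Counter(salt_key(key, i) for key, i in rows)

largest_before = before.most_common(1)[0][1]  # 800 rows on one key
largest_after = max(after.values())           # 100 rows per salted bucket
```

In Spark the dimension side of the join is duplicated once per salt so matches are preserved; the payoff is that no single task carries the whole hot key.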
When your datasets are measured in terabytes, even a small inefficiency cascades into massive performance hits.\n","permalink":"https://w3shubh.com/posts/big-data-pipelines/","summary":"Owned critical big data infrastructure handling millions of signals daily. Optimized a core PySpark DAG saving 45 mins/day and retired legacy systems to save $130K annually.","title":"Big Data Engineering: Optimizing PySpark Pipelines at 6sense"},{"content":"Ape Unit GmbH is a Berlin-based digital studio known for its innovative work in blockchain, decentralization, and climate-tech. Working here was my first international experience and a formative chapter in my career.\nThe Blockchain Explorer My primary project was building the Aeternity Blockchain Explorer — a sophisticated web application that lets users explore the Aeternity blockchain in real time. The explorer allows users to:\nBrowse transactions, mined blocks, and account balances Search for specific transactions, blocks, or addresses View network statistics and chain health Inspect smart contract deployments and interactions 🔗 Links:\nBlockchain Explorer (Web Archive) GitHub Repository Technical Decisions We built the explorer using Vue.js with a component-based architecture. The key technical challenges included:\nReal-time data: WebSocket connections to blockchain nodes for live block/transaction updates Search performance: Implementing efficient search across millions of blockchain records Data visualization: Clean, intuitive displays for complex blockchain data structures Building a Team in India Beyond coding, I took on a leadership role by establishing and managing a new tech team of four developers in India. 
This involved:\nRecruiting and onboarding developers remotely Setting up Agile workflows (sprints, standups, retrospectives) Coordinating across time zones between Berlin and India Overseeing the complete product development lifecycle Technologies Used Vue.js · JavaScript · SCSS · HTML · WebSockets · REST APIs\nThis role taught me that building software across continents is as much about communication and process as it is about code.\n","permalink":"https://w3shubh.com/posts/apeunit/","summary":"Led development of a blockchain explorer for the Aeternity network and managed a new tech team of four developers across India and Berlin.","title":"Ape Unit — Blockchain Explorer \u0026 Team Building in Berlin"},{"content":"While prompt engineering can solve many problems in Generative AI, there\u0026rsquo;s a hard limit to what you can achieve when an LLM needs access to massive, ever-changing proprietary datasets. This is where Retrieval-Augmented Generation (RAG) becomes essential.\nBeyond Context Windows In my work building AI systems, we frequently hit the limitations of standard API calls:\nContext Limits: We couldn\u0026rsquo;t fit an organization\u0026rsquo;s entire CRM history or knowledge base into a single prompt. Hallucinations: When LLMs don\u0026rsquo;t know the answer, they make one up. In enterprise B2B sales contexts, a hallucinated metric or name can ruin trust. Stale Data: Models are trained on past data, but intent signals change daily. The RAG Architecture To solve this, I\u0026rsquo;ve worked extensively with RAG architectures using Vector Databases.\nThe workflow:\nIngestion \u0026amp; Embedding: Take massive text datasets (contact histories, intent signals, company documentation) and convert them into dense vector embeddings using models like OpenAI\u0026rsquo;s text-embedding-ada-002. Vector Storage: Store these high-dimensional vectors in specialized Vector Databases optimized for rapid similarity search. 
Semantic Retrieval: When an AI prompt is generated (e.g., \u0026ldquo;Write an email referencing our past interactions\u0026rdquo;), the system embeds the query and searches the Vector DB for its nearest neighbors in the embedding space. Augmented Generation: The relevant retrieved documents are injected into the prompt context, forcing the LLM to base its answer only on the provided facts. Engineering Challenges Working with RAG isn\u0026rsquo;t just plugging APIs together; it requires solving hard engineering problems:\nChunking Strategies: Deciding how to break down documents. Too small, and you lose semantic context. Too large, and you dilute the embedding\u0026rsquo;s focus. Hybrid Search: Relying purely on semantic vector search isn\u0026rsquo;t enough. We often need to combine vector similarity with traditional keyword search (BM25) and metadata filtering to get the most relevant results. Latency: Adding a vector search step before the LLM inference step introduces latency. Optimizing the database and caching frequent queries is critical for user experience. 
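The retrieve-then-augment loop can be sketched end to end in plain Python, with a toy keyword-indicator embedding standing in for a real embedding model (document texts, vocabulary, and prompt wording are invented):

```python
import math

VOCAB = ['pricing', 'security', 'email', 'demo', 'whitepaper', 'visited', 'downloaded']

def embed(text: str) -> list[float]:
    # Toy keyword-indicator embedding; a real system would call a model
    # such as text-embedding-ada-002 to produce dense vectors.
    lowered = text.lower()
    return [1.0 if word in lowered else 0.0 for word in VOCAB]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm if norm else 0.0

docs = [
    'Acme visited the pricing page three times this week',
    'Globex downloaded the security whitepaper',
]
index = [(doc, embed(doc)) for doc in docs]  # stand-in for the vector store

def retrieve(query: str, k: int = 1) -> list[str]:
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

# Ground the generation step in retrieved facts only.
context = retrieve('write an email about pricing interest')[0]
prompt = f'Using ONLY this fact: {context}. Draft the outreach email.'
```

Swapping embed for real model calls and the list for a vector database gives the production shape; the grounding instruction in the assembled prompt is what curbs hallucination.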
Tech Stack Python · OpenAI API · Vector Databases · LangChain / LlamaIndex · Semantic Search\nRAG is the bridge between the reasoning engine of an LLM and the specific, factual reality of an enterprise\u0026rsquo;s data.\n","permalink":"https://w3shubh.com/posts/rag-vector-databases/","summary":"Designing advanced LLM architectures using Retrieval-Augmented Generation to ground AI responses in proprietary data.","title":"Building with RAG and Vector Databases"},{"content":"TransformNow is a platform dedicated to promoting healthy eating by offering personalized diet plans, ingredient recommendations, and wellness guidance.\nWhat I Built I designed and developed the entire application as a Progressive Web App (PWA) from scratch:\n30+ interactive pages covering diet planning, meal tracking, ingredient databases, and user dashboards Offline-first architecture — the app works seamlessly without an internet connection using Service Workers Installable on mobile — users can add the app to their home screen for a native-like experience Optimized for performance — lazy loading, code splitting, and image optimization throughout Lighthouse Scores The app achieved 90+ scores across all four Lighthouse categories:\nCategory Score Performance 90+ Accessibility 90+ Best Practices 90+ SEO 90+ Technologies Used ReactJS · JavaScript · HTML · CSS · Service Workers · Workbox\n","permalink":"https://w3shubh.com/posts/transformnow/","summary":"Designed and developed a Progressive Web App with 30+ pages scoring 90+ on Lighthouse for performance, accessibility, and SEO.","title":"TransformNow — A PWA for Healthy Living"},{"content":"During my time at IIT Roorkee, long before Web3 became a saturated buzzword, my team built LifeBlocks — a trustless certificate verification system leveraging the Ethereum blockchain.\n🔗 Links:\nDevpost Submission GitHub Repository The Problem Verifying academic degrees, bootcamp certificates, and employment history is a slow, manual process. 
Employers must contact issuing institutions, who then manually verify their records. Digital PDFs are easily forged.\nWe wanted to create a system where a certificate is issued once and can be mathematically verified instantly by anyone, without trusting a central database.\nArchitecture I worked as the full-stack developer on the project, spanning both the decentralized backend and the user-facing web app.\nThe Smart Contract (Solidity) We wrote Ethereum smart contracts in Solidity to serve as the immutable ledger.\nInstitutions register on the platform and are granted issuing rights. When a certificate is issued, a cryptographic hash of the certificate data is stored on the Ethereum blockchain, tied to the issuer\u0026rsquo;s address and the recipient. The actual private data remains off-chain, while the verifiable proof lives on-chain. The Frontend (React) A blockchain backend is useless without an accessible interface. I built the frontend using ReactJS and Web3.js:\nIntegrated MetaMask for user authentication and transaction signing. Built dashboards for institutions to issue certificates in bulk. Created a simple public verification portal: an employer drops a certificate file or inputs an ID, the app hashes the data, checks it against the smart contract, and instantly returns a \u0026ldquo;Verified\u0026rdquo; or \u0026ldquo;Invalid\u0026rdquo; status. 
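The issue-and-verify flow can be sketched in Python, with a dict standing in for the on-chain mapping (issuer address and certificate data are invented):

```python
import hashlib

# A dict stands in for the on-chain mapping; in the real system the
# hash was stored in a Solidity contract keyed to the issuer address.
chain_registry: dict[str, str] = {}

def cert_hash(cert_data: str) -> str:
    # Only this hash goes on-chain; the private data stays off-chain.
    return hashlib.sha256(cert_data.encode()).hexdigest()

def issue(issuer: str, cert_data: str) -> None:
    chain_registry[cert_hash(cert_data)] = issuer

def verify(cert_data: str) -> str:
    issuer = chain_registry.get(cert_hash(cert_data))
    return f'Verified by {issuer}' if issuer else 'Invalid'

issue('0xIITR', 'Jane Doe, M.Tech, 2019')
assert verify('Jane Doe, M.Tech, 2019') == 'Verified by 0xIITR'
assert verify('Jane Doe, PhD, 2019') == 'Invalid'  # any tampering changes the hash
```

Because only the hash is stored, verification needs no access to private records, and any edit to the certificate data produces a different hash.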
Tech Stack Ethereum · Solidity · ReactJS · JavaScript · Web3.js\nThis project was my deep-dive into distributed systems and the concept of trustless architecture.\n","permalink":"https://w3shubh.com/posts/lifeblocks-web3/","summary":"Developed a blockchain system using Solidity and React enabling institutions to issue verifiable digital certificates.","title":"LifeBlocks: Trustless Certificate Verification on Ethereum"},{"content":"At Transcend Labs, I worked on two distinct full-stack projects that combined web development with marketplace integrations.\nInventory Management System Built an inventory management solution for online marketplaces. The system automated previously manual workflows:\nIntegrated with marketplace APIs (eBay, Amazon, Discogs) for real-time inventory sync Automated listing creation, price updates, and stock level management Python scripts for data extraction, transformation, and bulk operations Repricer — Price Monitoring \u0026amp; Alerts Developed a web application called Repricer that tracked prices across online marketplaces:\nReal-time monitoring of item availability and price changes on eBay, Amazon, and Discogs SMS notifications alerting users when items hit their target price or become available in desired condition Dashboard for managing watchlists, setting price rules, and viewing historical price trends Technologies Used Python · Flask · ReactJS · MySQL · HTML · CSS · REST APIs\n","permalink":"https://w3shubh.com/posts/transcend-labs/","summary":"Built a full-stack inventory management solution and a price-tracking Repricer app with SMS notifications for eBay, Amazon, and Discogs.","title":"Transcend Labs — Marketplace Automation \u0026 Inventory Management"},{"content":"Faculty360 is a platform designed to streamline the process of faculty recruitment and management for educational institutions.\nWhat I Built I developed the frontend for the platform, creating a responsive interface that enabled:\nFaculty profile management and 
discovery Job posting and application workflows for institutions Search and filter functionality across faculty databases Dashboard views for both institutions and candidates Technologies Used HTML · CSS · Bootstrap · JavaScript\n","permalink":"https://w3shubh.com/posts/faculty360/","summary":"Developed the frontend for a faculty recruitment and management platform for educational institutions.","title":"Faculty360 — Faculty Recruitment Platform"},{"content":"Hey, I\u0026rsquo;m Shubham 👋 I\u0026rsquo;m a Full Stack \u0026amp; AI Engineer with 6+ years of experience building scalable products — from reactive frontends to LLM-powered backend systems processing millions of signals daily.\nI graduated from IIT Roorkee (Integrated M.Tech, 2014–2019) and have since worked across startups, a Berlin-based blockchain company, and a Series D SaaS unicorn.\nWhat I Do 🤖 AI \u0026amp; LLM Engineering Designing and shipping production AI systems — prompt engineering, multi-layer guardrails, hallucination detection, feedback loops, and credit-based usage systems. Built systems generating 20,000+ AI summaries and 1,500+ AI-written emails daily.\n⚛️ Frontend Architecture Architecting large-scale React/TypeScript SPAs, reusable component libraries, and complex data visualizations. Led the frontend for a Sales Intelligence platform generating $20M+ in revenue.\n🔧 Backend \u0026amp; Big Data Django APIs, PySpark data pipelines, Hive/Hadoop clusters, and microservices. 
Optimized critical DAGs, decommissioned legacy systems saving $130K+ annually.\nSkills Category Technologies AI \u0026amp; Data OpenAI, Prompt Engineering, LLM Guardrails, PySpark, Hadoop, Hive, MySQL Languages \u0026amp; Frameworks Python, TypeScript, JavaScript, React, Vue, Django, Flask Architecture \u0026amp; DevOps Microservices, CI/CD, AWS SES, AWS EC2, GraphQL, Apollo Career Timeline Period Role Company 2020 – 2025 Full Stack Developer → AI Engineer 6sense 2020 Founding Engineer OrangeSomething (incubated at Blinkit) 2018 – 2019 Frontend Developer Ape Unit GmbH, Berlin 2016 – 2020 Freelance Developer TransformNow, Transcend Labs, Faculty360 Education 🎓 Indian Institute of Technology, Roorkee Integrated B.Tech + M.Tech in Geological Technology (Earth Sciences) 2014 – 2019\nGet in Touch 📧 ror.shubh@gmail.com 🔗 LinkedIn 🐙 GitHub 🌐 w3shubh.com ","permalink":"https://w3shubh.com/about/","summary":"About Shubham","title":"About"}]