I'm conducting academic research on labor market analysis and need an ML/NLP specialist to help with algorithm development. I have a working data pipeline collecting 100K+ job postings and need help with the machine learning components.
Current Infrastructure (Already Built)
- PostgreSQL database with pgvector (123K+ job postings)
- Job scraper collecting from multiple job posting platforms
- MLflow for experiment tracking
- Streamlit dashboard for visualization
Help Needed (Algorithm Work)
- Text Embeddings: Optimize embedding generation for job descriptions (BERT, BGE, E5, OpenAI embeddings)
- Clustering: Implement and tune clustering algorithms (K-means, HDBSCAN, hierarchical) for occupational grouping
- Taxonomy Mapping: Align clusters with O*NET/SOC occupational codes
- Skills Extraction: NLP pipeline for extracting skills from job descriptions
- Evaluation: Develop metrics for cluster quality and taxonomy alignment
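To make the evaluation item concrete, here is a minimal sketch of one possible taxonomy-alignment metric: cluster purity against SOC codes, assuming a labeled subset of postings is available. This is only an illustration of the kind of work involved, not the project's existing method. The function names (`cluster_purity`, `majority_mapping`) and the sample cluster IDs and SOC labels are hypothetical.

```python
from collections import Counter, defaultdict


def _counts_by_cluster(cluster_ids, soc_codes):
    """Count SOC codes per cluster; inputs are parallel sequences."""
    if len(cluster_ids) != len(soc_codes):
        raise ValueError("cluster_ids and soc_codes must have the same length")
    if len(cluster_ids) == 0:
        raise ValueError("at least one posting is required")
    counts = defaultdict(Counter)
    for cluster_id, soc in zip(cluster_ids, soc_codes):
        counts[cluster_id][soc] += 1
    return counts


def majority_mapping(cluster_ids, soc_codes):
    """Map each cluster to the SOC code that occurs most often in it."""
    counts = _counts_by_cluster(cluster_ids, soc_codes)
    return {cid: c.most_common(1)[0][0] for cid, c in counts.items()}


def cluster_purity(cluster_ids, soc_codes):
    """Fraction of postings whose SOC code equals their cluster's majority code."""
    counts = _counts_by_cluster(cluster_ids, soc_codes)
    matched = sum(c.most_common(1)[0][1] for c in counts.values())
    return matched / len(cluster_ids)


# Hypothetical sample: six postings, three clusters, hand-assigned SOC codes.
sample_clusters = [0, 0, 0, 1, 1, 2]
sample_soc = ["15-1252", "15-1252", "15-2051", "29-1141", "29-1141", "15-2051"]

print(majority_mapping(sample_clusters, sample_soc))
print(f"purity = {cluster_purity(sample_clusters, sample_soc):.3f}")  # purity = 0.833
```

Purity rewards over-clustering (one posting per cluster scores 1.0), so in practice it would likely be reported alongside an internal measure such as silhouette score and a chance-corrected measure such as adjusted Rand index.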
Ideal Candidate
- Strong Python experience (scikit-learn, sentence-transformers, PyTorch)
- Experience with text embeddings and semantic similarity
- Knowledge of clustering algorithms and evaluation metrics
- Familiarity with labor market data or occupational taxonomies is a plus
- Academic research background preferred