About the Job
As part of our next phase of innovation, we are building a decentralized AI dataset pool that lets users query, visualize, and use ready-made datasets for fine-tuning and training their AI models, all within our platform. This pool will also allow external contributors to share datasets and earn revenue based on data quality, quantity, and usage.
- Job type: Full-time and remote. Flexible working hours from Monday to Saturday, totaling at least 44 hours per week. We need A players who can work like hell; if you prefer a 9-5 job, this might not be the right company for you.
- Location: Work from anywhere.
- Reporting line: CTO
Your Responsibilities
We are looking for a skilled Senior Data Infrastructure Engineer to lead the development of our decentralized AI dataset pool. This individual will play a critical role in organizing and structuring our current unstructured dataset stored on S3, designing user-facing tools for data interaction, and implementing a revenue-sharing system for contributors. You will work closely with the blockchain team on storage and on the revenue-sharing mechanism for external contributors. The ideal candidate has expertise in building scalable data systems, integrating APIs with platforms, and working with AI/ML workflows. Experience with decentralized technologies such as IPFS is a plus.
Key responsibilities include:
- Data Structuring & Management: Transform unstructured datasets stored on S3 into a structured, queryable, and accessible format. Develop efficient data pipelines and systems for data ingestion, transformation, and management.
- Platform Integration: Build tools for users to query datasets, visualize data insights, and test data quality. Enable users to fine-tune and train their models using selected datasets without downloading data, ensuring usage remains within the current AIxBlock platform.
- Contributions & Revenue Sharing: Implement mechanisms for external contributors to submit datasets. Develop a system to calculate and distribute revenue shares based on dataset quality, quantity, and usage. This work requires close collaboration with the blockchain team.
- Data Security & Compliance: Ensure robust data security, privacy, and compliance with applicable regulations. Implement access controls and audit trails for data usage.
- Scalability & Decentralization: Design and implement decentralized storage solutions (e.g., IPFS, Arweave) to align with AIxBlock’s vision. Ensure scalability to handle large-scale datasets and user interactions.
Requirements
Technical Skills:
- Proven experience in data engineering, particularly with unstructured data.
- Strong expertise in AWS S3, databases, and data querying tools.
- Proficiency in building and integrating APIs for data interaction.
- Hands-on experience with data visualization tools.
- Familiarity with machine learning workflows and tools.
- Knowledge of decentralized storage solutions and blockchain technologies (preferred).
Compensation & Benefits
Compensation: Base salary negotiated depending on experience, plus a token bonus based on performance.
Benefits:
- Salary review based on performance
- Birthday gift
- Holiday gift
- Year-End Performance Bonus (Cash)
- Year-end party
... (more benefits listed in the original text)
Application Process
Resume & Portfolio screening
Interview with the TA
Interview with the CTO
Offer discussion and contract signing
Please note: We're all about remote work and have collaborators based all around the world, and English is our primary language. Therefore, a CV in English is required. The application process may be shortened or extended when necessary.