
Kuldeep Singh Sidhu


AI & ML interests

Seeking contributors for a completely open-source 🚀 Data Science platform! singhsidhukuldeep.github.io


Posts 58

Researchers from Tencent have developed DepthCrafter, a novel method for generating temporally consistent long depth sequences for open-world videos using video diffusion models.

It leverages a pre-trained image-to-video diffusion model (SVD) as the foundation and uses a 3-stage training strategy on paired video-depth datasets:
1. Train on a large realistic dataset (1-25 frames)
2. Fine-tune temporal layers on realistic data (1-110 frames)
3. Fine-tune spatial layers on synthetic data (45 frames)

It adapts SVD's conditioning mechanism for frame-by-frame video input and employs latent diffusion in VAE space for efficiency.
For extremely long videos, it adds an inference strategy:
- Segment-wise processing (up to 110 frames)
- Noise initialization to anchor depth distributions
- Latent interpolation for seamless stitching
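
To make the segment-and-stitch idea concrete, here is a minimal sketch, not DepthCrafter's actual code. It assumes overlapping windows and a linear cross-fade over the shared frames; the function names, the 25-frame overlap, and the latent shapes are hypothetical, and the noise-initialization step is left out.

```python
import numpy as np

def split_segments(num_frames, seg_len=110, overlap=25):
    """Return (start, end) index pairs that cover the video with overlapping windows.

    seg_len=110 follows the post; overlap=25 is an assumed value for illustration.
    """
    if not 0 <= overlap < seg_len:
        raise ValueError("overlap must be in [0, seg_len)")
    if num_frames <= seg_len:
        return [(0, num_frames)]
    stride = seg_len - overlap
    segments, start = [], 0
    while True:
        end = min(start + seg_len, num_frames)
        segments.append((start, end))
        if end == num_frames:
            break
        start += stride
    return segments

def stitch_segments(latents, overlap):
    """Join per-segment latents (frames on axis 0) with a linear cross-fade.

    In each overlap, the weight moves from the earlier segment to the later one.
    """
    out = latents[0].copy()
    for seg in latents[1:]:
        if overlap == 0:
            out = np.concatenate([out, seg], axis=0)
            continue
        # Interpolation weights strictly between 0 and 1, one per shared frame.
        w = np.linspace(0.0, 1.0, overlap + 2)[1:-1]
        w = w.reshape(-1, *([1] * (seg.ndim - 1)))
        blended = (1.0 - w) * out[-overlap:] + w * seg[:overlap]
        out = np.concatenate([out[:-overlap], blended, seg[overlap:]], axis=0)
    return out

# Usage: stand-in "latents" for a 200-frame video, 4 channels per frame.
full = np.arange(200, dtype=float)[:, None] * np.ones((1, 4))
segments = split_segments(200, seg_len=110, overlap=25)
stitched = stitch_segments([full[s:e] for s, e in segments], overlap=25)
```

In a real pipeline each segment would be denoised separately, so the overlapping frames would differ slightly between segments and the cross-fade would smooth the seam.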

It outperforms SOTA methods on multiple datasets (Sintel, ScanNet, KITTI, Bonn).

Read here: https://depthcrafter.github.io

If you're passionate about the latest in AI, self-driving technology, and humanoid robotics, you need to catch this episode featuring Andrej Karpathy, in which he discusses OpenAI, Tesla, and education. It's 44 minutes, but you might have to slow it down given how fast he speaks!

Key Insights:

1. Self-Driving Cars as a Bridge to AGI:
Andrej explores the parallels between self-driving technology and Artificial General Intelligence (AGI), suggesting that in some respects, AGI has already been achieved within the realm of self-driving. Tesla’s approach, which emphasizes software over expensive hardware like LIDAR, exemplifies this.

2. Tesla vs. Waymo: The Battle of Approaches:
Tesla relies on vision-based systems with minimal sensors, leveraging advanced neural networks for decision-making. This contrasts sharply with Waymo's sensor-heavy vehicles, highlighting a broader software versus hardware challenge that could define the future of scalable autonomous driving.

3. End-to-End Deep Learning:
Andrej highlights the transition from manually programmed systems to fully end-to-end deep learning models that "eat through the stack." At Tesla, this shift has significantly reduced reliance on C++ code, making neural networks the driving force in software and hardware integration.

4. Humanoid Robotics - More Than Just a Dream:
The shift from Tesla’s automotive neural networks to humanoid robots like Optimus is nearly seamless. By using the same sensors and computational platforms, Tesla is redefining what a robotics company can achieve at scale, bridging the gap between vehicle AI and human-like robotics.


5. The Power of Transformers in AI

6. Synthetic Data: The Future of AI Training

7. AI for Education - A Revolutionary Approach

The full, super-fast talk is here: https://youtu.be/hM_h0UA7upI

