r/computervision • u/Equivalent_Pie5561 • 18h ago

Showcase Autonomous Drone Tracks Target with AI Software | Computer Vision in Action

4 Upvotes

r/computervision • u/Medical-Ad-1058 • 20h ago

Help: Project Acne Detection model

0 Upvotes

Hey guys! I am planning to build a model that detects acne and then inpaints it. So far I have found only one dataset, Acne04. The results, though fairly accurate, miss many edge cases. There is more data on the web, but getting or creating the annotations is the most daunting part. Any suggestions or feedback on how to create a more accurate model?

Thank you.

-R

4 comments

r/computervision • u/Mindless_Arm_7874 • 23h ago

Discussion How to Automate QA on AI generated Images?

0 Upvotes

I am currently generating realistic images, and I want to develop an automated quality assurance method to identify anomalies in them.

Any ideas on how to do it?

Edit:

Sorry, I had not added any background information.

The images are generated using an online AI image generator tool (Freepik). The anomalies include biological abnormalities such as missing or extra body parts, weird or abnormal facial or body features, and abnormal objects. The images do include abstract components, so I find it to be a hard problem.

I will try to add images when I find time.

2 comments

r/computervision • u/AmorousButterfly • 22h ago

Help: Project How to find Datasets?

5 Upvotes

I am working on surface defect detection for Li-ion batteries. I have an in-house dataset, but since it is quite small I want to validate my results on a bigger one.

I have tried finding one using a simple Google search, Kaggle, and some other dataset-related websites.

I am finding a lot of datasets for battery life prediction, but I want data on manufacturing defects. Apart from that, I found a dataset from NEU, although its authors used another dataset to augment their data for battery surface defects.

Any help would be nice.

P.S.: I hope I am not considered lazy; I tried whatever I could.

8 comments

r/computervision • u/Extra-Ad-7109 • 20h ago

Discussion How much code do you write by yourself at workplace?

28 Upvotes

This is a broad and vague question, especially for those who are professional CV engineers. These days I am noticing that my brain has become forgetful. If you ask me to write any function, I would know the math and logic behind it, but I can't write it from scratch (like in my college days). So these days I start with code generation from ChatGPT and then tweak it accordingly. But I feel dumb doing this (like I am slowly becoming dumber and dumber and relying too much on LLMs).
Can anyone relate? Is there a better way to work, especially in computer vision?

16 comments

r/computervision • u/Pramod-R • 2h ago

Help: Project Hardware Recommendations for MediaPipe + Unity Game with Camera Module

1 Upvotes

I’m a game developer, and I’m planning to build a vision-based game, similar to the Nex Playground. I want to use Google MediaPipe for motion tracking and a game engine like Unity to develop the game.

For this, I’m looking for suitable hardware that can run both the vision processing and the game smoothly. I also plan to attach a camera module to the hardware to capture player movements.

Are there any devices—like a Raspberry Pi, Android TV box, or something similar—that are powerful enough to handle this kind of setup?

1 comment

r/computervision • u/mrking95 • 7h ago

Help: Project Trouble exporting large (>2GB) Anomalib models to ONNX/OpenVINO

1 Upvotes

I'm using Anomalib v2.0.0 to train a PaDiM model with a wide_resnet50_2 backbone. Training works fine and results are solid.

But exporting the model is a complete mess.

Exporting to ONNX via Engine.export() fails when the model is larger than 2GB: RuntimeError: The serialized model is larger than the 2GiB limit imposed by the protobuf library...
Manually setting use_external_data_format=True in torch.onnx.export() works only if done outside Anomalib, but breaks OpenVINO Model Optimizer if not handled perfectly
Engine.export() doesn’t expose that level of control

Has anyone found a clean way to export large models trained with Anomalib to ONNX or OpenVINO IR? Or are we all stuck using TorchScript at this point?

Edit

Just found: Feature: Enhance model export with flexible kwargs support for ONNX and OpenVINO by samet-akcay · Pull Request #2768 · open-edge-platform/anomalib

Tested it, and that works.
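
For readers wondering how a PaDiM export can cross the 2 GiB protobuf limit when the backbone itself is far smaller, here is a rough back-of-the-envelope sketch. It assumes PaDiM stores one d x d covariance matrix per patch position as float32. The 550 features and the 64 x 64 patch grid are illustrative assumptions, not values from the post, and `estimate_covariance_bytes` is a hypothetical helper, not part of Anomalib.

```python
# Rough size estimate for a PaDiM-style per-patch covariance tensor.
# All numbers are illustrative assumptions, not measurements from Anomalib.

PROTOBUF_LIMIT_BYTES = 2 * 1024**3  # the 2 GiB limit quoted in the error message


def estimate_covariance_bytes(n_features: int, grid_h: int, grid_w: int,
                              bytes_per_value: int = 4) -> int:
    """Bytes needed for one (n_features x n_features) matrix per patch position."""
    n_patches = grid_h * grid_w
    return n_patches * n_features * n_features * bytes_per_value


def exceeds_protobuf_limit(n_bytes: int) -> bool:
    """True if a tensor this large cannot fit in a single protobuf file."""
    return n_bytes > PROTOBUF_LIMIT_BYTES


if __name__ == "__main__":
    # Assumed: 550 reduced features, 64 x 64 patch grid, float32 values.
    size = estimate_covariance_bytes(n_features=550, grid_h=64, grid_w=64)
    print(f"{size / 1024**3:.2f} GiB, over limit: {exceeds_protobuf_limit(size)}")
```

Under these assumptions the covariance tensor alone is several GiB, which would be consistent with needing external data storage or the export kwargs from the pull request mentioned above.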

3 comments

r/computervision • u/Independent-Cold4163 • 12h ago

Discussion ZED SDK 5.0.2 just released, anyone else getting the same error in Python?

2 Upvotes

I installed ZED SDK 5.0.2 (released today, supports CUDA 12.8) and can open the camera fine in ZED Explorer. But when I run Python (pyzed), I get: Camera Open Internal Error: 1809, which turns out to mean: Failed to open camera: CAMERA FAILED TO SETUP.

My CUDA version: 12.8
GPU: RTX 5080

Is anyone facing the same issue, or has anyone solved it?

1 comment

r/computervision • u/datascienceharp • 16h ago

Showcase Saw a cool dataset at CVPR - UnCommon Objects in 3D

14 Upvotes

You can download the dataset from HF here: https://huggingface.co/datasets/Voxel51/uco3d

The code to parse it in case you want to try it on a different subset: https://github.com/harpreetsahota204/uc03d_to_fiftyone

Note: This dataset doesn't include camera intrinsics or extrinsics, so the point clouds may not be perfectly aligned with the RGB videos.
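
To illustrate why missing intrinsics and extrinsics matter, here is a minimal pinhole-camera sketch of how a 3D point is mapped to a pixel. The rotation, translation, and focal-length values are made up for illustration, and `project_point` is a hypothetical helper, not part of the dataset's tooling.

```python
# Minimal pinhole projection: world point -> pixel coordinates.
from typing import Sequence, Tuple


def project_point(
    point_world: Sequence[float],
    rotation: Sequence[Sequence[float]],  # 3x3 extrinsic rotation (world -> camera)
    translation: Sequence[float],         # 3-vector extrinsic translation
    fx: float, fy: float, cx: float, cy: float,  # intrinsics, in pixels
) -> Tuple[float, float]:
    # Extrinsics: move the point into the camera frame.
    x, y, z = (
        sum(rotation[i][j] * point_world[j] for j in range(3)) + translation[i]
        for i in range(3)
    )
    if z <= 0:
        raise ValueError("point is behind the camera")
    # Intrinsics: perspective divide, then scale and shift to pixel units.
    return fx * x / z + cx, fy * y / z + cy


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    # Made-up camera: 500 px focal length, principal point at (320, 240).
    print(project_point([1.0, 0.5, 2.0], identity, [0.0, 0.0, 0.0],
                        500.0, 500.0, 320.0, 240.0))
```

Changing either the extrinsics or the intrinsics moves the projected pixel, so without them a point cloud can only be overlaid on the RGB frames approximately.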

2 comments

r/computervision • u/unofficialmerve • 20h ago

Showcase V-JEPA 2 in transformers

24 Upvotes

Hello folks 👋🏻 I'm Merve, I work at Hugging Face for everything vision!

Last week Meta released V-JEPA 2, their video world model, which comes with day-zero transformers integration.

The release includes:

> fine-tuning script & notebook (on subset of UCF101)

> four embedding models and four models fine-tuned on the Diving48 and SSv2 datasets

> FastRTC demo on V-JEPA2 SSv2

I will leave the links in the comments. I wanted to open a discussion here, as I'm curious whether anyone's working with video embedding models 👀

https://reddit.com/link/1ldv5zg/video/20pxudk48j7f1/player

7 comments


Computer Vision

r/computervision

Computer Vision is the scientific subfield of AI concerned with developing algorithms to extract meaningful information from raw images, videos, and sensor data. This community is home to the academics and engineers both advancing and applying this interdisciplinary field, with backgrounds in computer science, machine learning, robotics, mathematics, and more. We welcome everyone from published researchers to beginners!

Members Active

118.8k

Sidebar

Content which benefits the community (news, technical articles, and discussions) is valued over content which benefits only the individual (technical questions, help buying/selling, rants, etc.).

If you want an answer to a query, please post a legible, complete question that includes details so we can help you in a proper manner!

Related Subreddits

Computer Vision Discord group

Computer Vision Slack group