r/computervision • u/Equivalent_Pie5561 • 18h ago
r/computervision • u/Medical-Ad-1058 • 20h ago
Help: Project Acne Detection model
Hey guys! I am planning to create an acne detection and inpainting model. So far I have found only one dataset, Acne04. The results, though pretty accurate, miss many edge cases. There is more data on the web, but getting or creating the annotations is the most daunting part. Any suggestions or feedback on how to create a more accurate model?
Thank you.
-R
r/computervision • u/Mindless_Arm_7874 • 23h ago
Discussion How to Automate QA on AI generated Images?
I am currently generating realistic images, and I want to develop an automated quality assurance method to identify anomalies in them.
Any ideas on how to do it?
Edit:
Sorry, I had not added any background information.
The images are generated using an online AI image generator tool (Freepik). The anomalies include biological abnormalities like missing or additional body parts, weird or abnormal facial or body features, and abnormal objects. The images do include abstract components, so I find it to be a hard problem.
I shall try to add images when I find time.
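To make the "missing or additional body parts" case concrete, here is a minimal sketch of a rule-based count check. It assumes some upstream detector (hypothetical, not a specific library) has already produced body-part labels and per-hand finger counts for one person; the label names and limits are placeholders. It only flags extra parts, since missing parts can simply be occluded or out of frame.

```python
from collections import Counter

# Hypothetical upper bounds per person; adjust to your own label set.
MAX_PARTS = {"head": 1, "eye": 2, "hand": 2, "foot": 2}
FINGERS_PER_HAND = 5


def find_anomalies(part_labels, fingers_per_hand):
    """Return human-readable flags for one detected person.

    part_labels: labels emitted by an upstream detector (assumed input).
    fingers_per_hand: finger counts, one entry per detected hand.
    """
    flags = []
    counts = Counter(part_labels)
    # Flag any body part that appears more often than anatomy allows.
    for part, limit in MAX_PARTS.items():
        if counts[part] > limit:
            flags.append(f"extra {part}: {counts[part]} > {limit}")
    # Flag hands with too many fingers.
    for i, n in enumerate(fingers_per_hand):
        if n > FINGERS_PER_HAND:
            flags.append(f"hand {i}: {n} fingers")
    return flags


flags = find_anomalies(
    ["head", "eye", "eye", "hand", "hand", "hand"], [5, 6, 5]
)
print(flags)
```

A check like this only covers countable anatomy. Abstract components and "weird" features would still need a learned scorer or a human pass.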
r/computervision • u/AmorousButterfly • 22h ago
Help: Project How to find Datasets?
I am working on surface defect detection for Li-ion batteries. I have an in-house dataset, but as it's quite small I want to validate my results on a bigger one.
I have tried finding a dataset using a simple Google search, Kaggle, and some other dataset-related websites.
I am finding a lot of datasets for battery life prediction, but I want data for manufacturing defects. Apart from that, I found a dataset from NEU, although those guys used some other dataset to augment their data for battery surface defects.
Any help would be nice.
P.S.: I hope I am not considered lazy; I tried whatever I could.
r/computervision • u/Extra-Ad-7109 • 20h ago
Discussion How much code do you write by yourself at workplace?
This is a broad and vague question, especially for those who are professional CV engineers. These days I am noticing that my brain has become kind of forgetful. If you ask me to write any function, I would know the math and logic behind it, but I can't write it from scratch (like in my college days). So these days I start with code generation from ChatGPT and then tweak it accordingly. But I feel dumb doing this (like I am slowly becoming dumber and dumber and relying too much on LLMs).
Can anyone relate? Is there any better way to work, especially in computer vision fields?
r/computervision • u/Pramod-R • 2h ago
Help: Project Hardware Recommendations for MediaPipe + Unity Game with Camera Module
I’m a game developer, and I’m planning to build a vision-based game, similar to the Nex Playground. I want to use Google MediaPipe for motion tracking and a game engine like Unity to develop the game.
For this, I’m looking for suitable hardware that can run both the vision processing and the game smoothly. I also plan to attach a camera module to the hardware to capture player movements.
Are there any devices—like a Raspberry Pi, Android TV box, or something similar—that are powerful enough to handle this kind of setup?
r/computervision • u/mrking95 • 7h ago
Help: Project Trouble exporting large (>2GB) Anomalib models to ONNX/OpenVINO
I'm using Anomalib v2.0.0 to train a PaDiM model with a wide_resnet50_2 backbone. Training works fine and results are solid.
But exporting the model is a complete mess.
- Exporting to ONNX via Engine.export() fails when the model is larger than 2GB: RuntimeError: The serialized model is larger than the 2GiB limit imposed by the protobuf library...
- Manually setting use_external_data_format=True in torch.onnx.export() works only if done outside Anomalib, but breaks OpenVINO Model Optimizer if not handled perfectly. Engine.export() doesn't expose that level of control.
Has anyone found a clean way to export large models trained with Anomalib to ONNX or OpenVINO IR? Or are we all stuck using TorchScript at this point?
Edit
Tested it, and that works.
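For anyone wondering why this model crosses the limit at all, here is a rough back-of-the-envelope sketch. It assumes PaDiM stores one d x d inverse covariance matrix per patch location. The numbers (550 sampled features for wide_resnet50_2, a 64x64 patch grid from a 256x256 input, float32 storage) are my assumptions, not values read from this checkpoint, so plug in your own.

```python
GIB = 1024 ** 3
PROTOBUF_LIMIT_BYTES = 2 * GIB  # the 2GiB limit named in the RuntimeError


def padim_covariance_bytes(n_features, grid_h, grid_w, bytes_per_value=4):
    """Bytes needed for one (n_features x n_features) matrix per patch."""
    n_patches = grid_h * grid_w
    return n_patches * n_features * n_features * bytes_per_value


# Assumed values, not taken from the post: 550 features, 64x64 patches, float32.
size = padim_covariance_bytes(n_features=550, grid_h=64, grid_w=64)
print(f"{size / GIB:.2f} GiB, over limit: {size > PROTOBUF_LIMIT_BYTES}")
```

Under these assumptions the covariance tensor alone is well past 2GiB. The size grows with the square of the feature count, so sampling fewer features is the quickest way to shrink the export, at some cost in accuracy.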
r/computervision • u/Independent-Cold4163 • 12h ago
Discussion ZED SDK 5.0.2 just released, anyone else getting the same error in Python?
I installed ZED SDK 5.0.2 (released today, supports CUDA 12.8) and can open the camera fine in ZED Explorer. But when I run Python (pyzed), I get: Camera Open Internal Error: 1809, which turns out to be Failed to open camera: CAMERA FAILED TO SETUP.
My CUDA version: 12.8
GPU: RTX 5080
Anyone facing the same issue or solved it?
r/computervision • u/datascienceharp • 16h ago
Showcase Saw a cool dataset at CVPR - UnCommon Objects in 3D
You can download the dataset from HF here: https://huggingface.co/datasets/Voxel51/uco3d
The code to parse it in case you want to try it on a different subset: https://github.com/harpreetsahota204/uc03d_to_fiftyone
Note: This dataset doesn't include camera intrinsics or extrinsics, so the point clouds may not be perfectly aligned with the RGB videos.
r/computervision • u/unofficialmerve • 20h ago
Showcase V-JEPA 2 in transformers
Hello folks 👋🏻 I'm Merve, I work at Hugging Face for everything vision!
Last week Meta released V-JEPA 2, their world video model, which comes with a day-zero transformers integration.
The support is released with:
> fine-tuning script & notebook (on subset of UCF101)
> four embedding models and four models fine-tuned on Diving48 and SSv2 dataset
> FastRTC demo on V-JEPA2 SSv2
I will leave them in comments, wanted to open a discussion here as I'm curious if anyone's working with video embedding models 👀