This article provides a detailed introduction to CS 6476 COMPUTER VISION, along with practical information on 3D Computer Vision, A Self-Organized Computer Virus Demo in C, the Android Studio problem "E/Vision: Error loading optional module com.google.android.gms.vision.ocr", and an awesome computer vision repo.
Contents of this article:
- CS 6476 COMPUTER VISION
- 3D Computer Vision
- A Self-Organized Computer Virus Demo in C
- Android Studio problem - E/Vision: Error loading optional module com.google.android.gms.vision.ocr
- awesome computer vision repo
CS 6476 COMPUTER VISION
GEORGIA TECH’S CS 6476 COMPUTER VISION
Final Project: Classification and Detection with
Convolutional Neural Networks
April 1, 2023
PROJECT DESCRIPTION AND INSTRUCTIONS
Description
For this topic you will design a digit detection and recognition system which takes in a single
image and returns any sequence of digits visible in that image. For example, if the input image
contains a home address, 123 Main Street, your algorithm should return "123". One step in your
processing pipeline must be a Convolutional Neural Network (CNN) implemented in TensorFlow or PyTorch. If you choose this topic, you will need to perform additional research about
CNNs. Note that the sequences of numbers may have varying scales, orientations, and fonts,
and may be arbitrarily positioned in a noisy image.
Sample Dataset: http://ufldl.stanford.edu/housenumbers/
Related Lectures (not exhaustive): 8A-8C, 9A-9B
Problem Overview
Methods to be used: Implement a Convolutional Neural Network-based method that is capable of detecting and recognizing any sequence of digits visible in an image.
RULES:
• Don't use external libraries for core functionality. You may use TensorFlow, Keras, and PyTorch, and are even required to use pretrained models as part of your pipeline.
However, you will receive a low score if the main functionality of your code is provided
by an external library.
• Don't copy code from the internet. The course honor code is still in effect during the final
project. All of the code you submit must be your own. You may consult tutorials for
libraries you are unfamiliar with, but your final project submission must be your own
work.
• Don't use pre-trained machine learning pipelines. If you choose a topic that requires the
use of machine learning techniques, you are expected to do your own training. Downloading and submitting a pre-trained model that does all the work is not acceptable for
this assignment. For the section on reusing pre-trained weights, you are expected to use a
network trained for another classification task and re-train it for this one.
• Don't rely on a single source. We want to see that you performed research on your chosen topic and incorporated ideas from multiple sources in your final results. Your project
must not be based on a single research paper, and definitely must not be based on a single
online tutorial.
Please do not use absolute paths in your submission code. All paths must be relative
to the submission directory. Any submissions with absolute paths are in danger of receiving a penalty!
Starter Code
There is no starter code for this project.
Programming Instructions
In order to work with Convolutional Neural Networks we are providing a conda environment
description with the versions of the libraries that the TA will use in the grading environment
in canvas->files->Project files. This environment includes PyTorch, TensorFlow, Scikit-learn,
and SciPy. You may use any of these. It is your responsibility to use versions of libraries that
are compatible with those in the environment. It is also up to you to organize your files and
determine the code’s structure. The only requirement is that the grader must only run one
file to get your results. This, however, does not prevent the use of helper files linked to this
main script. The grader will not open and run multiple files. Include a README.md file with
usage instructions that are clear for the grader to run your code.
Write-up Instructions
The report must be a PDF of 4-6 pages including images and references. Not following this
requirement will incur a significant penalty and the content will be graded only up to page 6.
Note that the report will be graded subject to working code. There will be no report templates
provided with the project materials.
The report must contain the following. Your report must be written to show your work and demonstrate a deep understanding of your
chosen topic. The discussion in your report must be technical and quantitative wherever possible.
• A clear and concise description of the algorithms you implemented. This description
must include references to recently published computer vision research and show a deep
understanding of your chosen topic.
• Results from applying your algorithm to images or video. Both positive and negative results must be shown in the report and you must explain why your algorithm works on
some images, but not others.
How to Submit
Similar to the class assignments, you will submit the code and the report to Gradescope (note:
there will be no autograder part). Find the appropriate project and make your submission into
the correct project. Important: submissions sent to email, Piazza, or anything that is not
Gradescope will not be graded.
Grading
The report will be graded following the scheme below:
• Code (30%): We will verify that the methods and rules indicated above have been followed.
• Report (70%): Subject to working code.
• Description of existing methods published in recent computer vision research.
• Description of the method you implemented.
• Results obtained from applying your algorithms to images or videos.
• Analysis on why your method works on some images and not on others. (with images)
• References and citations.
ASSIGNMENT OVERVIEW
This project requires you to research how Convolutional Neural Networks work and their application to number detection and recognition. This is not to be a replica of a tutorial found
online. Keep in mind this content is not widely covered in this course's lectures and resources.
The main objective of this assignment is to demonstrate your understanding of how these tools
work. We allow you to use a very powerful training framework that helps you to avoid many of
the time-consuming implementation details because the emphasis of this project will be on
the robustness of your implementation and in-depth understanding of the tools you are using.
Installation and Compatibility
The provided environment yml description gives you the versions of the libraries the TAs
will use during grading. We recommend you use conda to install the environment. Make sure the
forward pass of your pipeline runs in a reasonable amount of time when using only a CPU, as
some TAs do not have a GPU.
OS Warning:
Be warned that TAs may grade on Linux, Windows, or Mac machines. Thus, it is your responsibility to make sure that your code is platform independent. This is particularly important when
using paths to files. If your code doesn't run during grading due to some incompatibility, you
will incur a penalty.
Classifier Requirements
Your classification pipeline must be robust in the following ways:
- Scale Invariance: The scale of the sequence of numbers in an image may vary.
- Location Invariance: The location of the sequence of numbers in the image may vary.
- Font Invariance: You are expected to detect numbers despite variations in font.
- Pose Invariance: The sequence of numbers can be at any angle with respect to the frame of the image.
- Lighting Invariance: We expect robustness to the lighting conditions in which the image was taken.
- Noise Invariance: Make sure that your pipeline is able to handle Gaussian noise in the image.
Pipeline Overview:
The final pipeline should incorporate the following preprocessing and classification components. We expect you to clearly explain in your report what you did at each stage and why.
Preprocessing
Your pipeline should start from receiving an image like this:
[Figure: an example full street-scene input image containing a house number]
Notice that this is not the type of image your classification network was trained on. You will have to
do some preprocessing to correctly detect the number sequence in this image.
In the preprocessing stage your algorithm should take as input an image like the one above and
return regions of interest (ROIs). Those ROIs will be regions in the image where there is a digit. In order
to perform this preprocessing step you can use the MSER and/or sliding window algorithm with an
image pyramid approach (see https://docs.opencv.org/4.1.0/d3/d28/classcv_1_1MSER.html).
Note: The region proposal stage has to be separated from the classification stage. For this
project we will use MSER and/or sliding window to detect the ROI. This means that one-stage
approaches (detection + classification) such as YOLO are not allowed.
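For illustration, a minimal MSER proposal stage might look like the sketch below (Python/OpenCV assumed; the input file name and the digit-shape filter thresholds are illustrative assumptions, not part of the assignment):

import cv2

# Read the scene and detect stable extremal regions as digit candidates.
img = cv2.imread("scene.png")            # hypothetical input path
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

mser = cv2.MSER_create()                 # default parameters; tune for your data
regions, _ = mser.detectRegions(gray)

rois = []
for pts in regions:
    x, y, w, h = cv2.boundingRect(pts)
    # Keep only plausibly digit-shaped boxes (illustrative aspect/size filter).
    if 0.2 < w / float(h) < 1.2 and h > 10:
        rois.append((x, y, w, h))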
Noise Management
We expect to see you handle Gaussian noise and varying lighting conditions in the image. Please
explain what you do in order to handle these types of perturbations and still have your classifier
work.
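One possible treatment, as a sketch (OpenCV assumed; the kernel size and CLAHE parameters are illustrative choices):

import cv2

def preprocess(gray):
    # Gaussian blur attenuates additive Gaussian noise.
    denoised = cv2.GaussianBlur(gray, (5, 5), 0)
    # Contrast-limited adaptive histogram equalization (CLAHE) reduces the
    # impact of uneven lighting before region proposal and classification.
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    return clahe.apply(denoised)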
Location Invariance
Since you don't know where the numbers will appear in the image, you will have to search for
them using a sliding window method.
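A minimal sliding-window generator sketch (the window size and stride are illustrative assumptions to tune on your data):

def sliding_windows(img, win=32, stride=8):
    # Yield (x, y, crop) for every window position in the image.
    h, w = img.shape[:2]
    for y in range(0, h - win + 1, stride):
        for x in range(0, w - win + 1, stride):
            yield x, y, img[y:y + win, x:x + win]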
Scale Invariance
Make sure to implement an image pyramid with non-maxima suppression to detect numbers
at any scale.
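Both pieces can be sketched as follows (NumPy/OpenCV assumed; the scale factor, minimum size, and IoU threshold are illustrative, and boxes is an N x 4 array of (x1, y1, x2, y2) corners):

import cv2
import numpy as np

def pyramid(img, scale=0.75, min_size=32):
    # Yield progressively downscaled copies of the image.
    while min(img.shape[:2]) >= min_size:
        yield img
        img = cv2.resize(img, None, fx=scale, fy=scale)

def nms(boxes, scores, iou_thresh=0.3):
    # Greedily keep the highest-scoring boxes, dropping strong overlaps.
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.maximum(0, xx2 - xx1) * np.maximum(0, yy2 - yy1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                 (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + area_r - inter)
        order = order[1:][iou <= iou_thresh]
    return keep

Detections found at a downscaled pyramid level must be mapped back to original-image coordinates (divide by the cumulative scale) before running NMS across all levels.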
Performance Considerations
Running your full classifier through a sliding window can be very expensive. Did you do anything to mitigate forward pass runtime?
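One common mitigation is to batch all window crops into a few large forward passes instead of one network call per window, as in this sketch (PyTorch assumed; model, the crop shape, and the batch size are placeholders):

import torch

@torch.no_grad()
def classify_crops(model, crops, batch_size=256, device="cpu"):
    # crops: list of 1 x 32 x 32 float tensors (illustrative shape)
    model.eval()
    preds = []
    for i in range(0, len(crops), batch_size):
        batch = torch.stack(crops[i:i + batch_size]).to(device)
        preds.append(model(batch).argmax(dim=1).cpu())
    return torch.cat(preds)

Other common tricks include coarse-to-fine search (large stride first, then refinement around hits) and cheap heuristics such as rejecting near-uniform windows before the CNN is invoked.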
Classification
This section is concerned with the implementation of a number classifier based on the sample
dataset.
Model Variation
There are several approaches to implementing a classifier, and we want you to get exposure to all of them:
- Make your own architecture and train it from scratch, without pre-trained weights.
(https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial...)
- Use a VGG16 implementation and train it starting from pre-trained weights.
(Note: the final linear layer will have 11 classes.
https://pytorch.org/tutorials/beginner/transfer_learning_tuto...(finetuning-the-convnet))
Make sure you mention in your report what changes you made to the VGG16 model in order to
use it for your particular classification task. What weights did you reuse and why? Did you train
over the pre-trained weights?
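One way to set this up is sketched below (PyTorch/torchvision assumed; freezing the convolutional stack is an illustrative choice, treating the 11th class as "no digit" is an assumption, and older torchvision versions use pretrained=True instead of the weights argument):

import torch.nn as nn
from torchvision import models

# Load VGG16 with ImageNet-pretrained weights.
model = models.vgg16(weights="IMAGENET1K_V1")
# Optionally freeze the convolutional features and retrain only the head.
for p in model.features.parameters():
    p.requires_grad = False
# Replace the final linear layer: 11 outputs (e.g., digits 0-9 plus background).
model.classifier[6] = nn.Linear(4096, 11)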
Training Variation
We want you to have some familiarity with stochastic gradient descent. For this reason we
want you to explain your choice of loss function during training. We also want an explanation
of your choice of batch size and learning rate. In the report we expect a definition of these
parameters and an explanation of why you chose the numbers you did. We also want to see
how you decided to stop the training procedure.
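A minimal training-loop sketch follows (PyTorch assumed; model, train_loader, val_loader, and the evaluate helper are placeholders, and the loss, learning rate, momentum, and patience are illustrative values your report should justify):

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()        # a standard multi-class loss
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

best_val, patience, bad_epochs = float("inf"), 5, 0
for epoch in range(100):
    model.train()
    for images, labels in train_loader:  # batch size is set in the DataLoader
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    val_loss = evaluate(model, val_loader)   # hypothetical validation helper
    if val_loss < best_val:
        best_val, bad_epochs = val_loss, 0
        torch.save(model.state_dict(), "best.pt")
    else:
        bad_epochs += 1
        if bad_epochs >= patience:       # early stopping on the validation loss
            break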
Evaluating Performance
In order to evaluate the performance of your learning model we expect you to include training curves with validation, training, and test set errors. When you compare the performance of
each model we also want you to include tables with the test set performance of each model.
We want to see a discussion of your performance for each of the models outlined above, and we
want to see empirical data demonstrating which is better. Your final pipeline should use the
model and training procedure that empirically demonstrates better performance.
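A helper of the following shape (PyTorch assumed; test_loader is a placeholder) can produce the per-model numbers for such a table:

import torch

@torch.no_grad()
def test_accuracy(model, test_loader):
    model.eval()
    correct = total = 0
    for images, labels in test_loader:
        preds = model(images).argmax(dim=1)
        correct += (preds == labels).sum().item()
        total += labels.numel()
    return correct / total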
FINAL RESULTS
Image Classification Results
During grading, TAs expect to be able to run a Python 3 file named run.py that writes five images to a graded_images folder in the current directory. The images should be named 1.png,
2.png, 3.png, 4.png, and 5.png.
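A minimal run.py skeleton matching this contract is sketched below; only the output folder and file names are fixed by the assignment, while detect_and_annotate and the input file names are hypothetical placeholders:

import os
import cv2

def main():
    os.makedirs("graded_images", exist_ok=True)
    inputs = ["input1.png", "input2.png", "input3.png",
              "input4.png", "input5.png"]      # illustrative input names
    for i, path in enumerate(inputs, start=1):
        img = cv2.imread(path)                 # note: relative paths only
        result = detect_and_annotate(img)      # hypothetical pipeline entry point
        cv2.imwrite(os.path.join("graded_images", str(i) + ".png"), result)

if __name__ == "__main__":
    main()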
You can pick these images; however, across the five of them we will be checking that you demonstrate the following:
- Correct classification at different scales.
- Correct classification at different orientations.
- Correct classification at different locations within the image.
- Correct classification under different lighting conditions.
Notice that since we allow you to pick the images, we expect good results.
In addition, add extra images showing failure cases of your implementation in the report. Analyse and comment on why your algorithm fails on those images.
3D Computer Vision
3D Computer Vision
Programming Assignment 2 – Epipolar Geometry
You will upload your code and a short report (PDF) in a zip file to the NewE3 system. Grading will be done
at demo time (face-to-face or Skype).
A C++ Visual Studio project is provided. To build the code, install VS 2019 (Community). When opening the
solution file (project2.sln), be sure NOT to upgrade the Windows SDK Version nor the Platform Toolset.
The project should be buildable and runnable on a Windows system. Your tasks are:
- [2p] For the test stereo images (pictures/stereo1_left.png, stereo1_right.png), find 8 matching pairs of
2D points. List them as g_matching_left and g_matching_right. Note: x and y are in the [-1,1] range. You
can define the matching manually, or
[Bonus: +1~2p to mid-term] use off-the-shelf matching methods (such as OpenCV feature matching or
others). The bonus amount depends on how well you understand and explain your matching method.
- [5p] Implement the normalized eight-point method in EpipolarGeometry() to calculate the fundamental
matrix (the same as the essential matrix here, since the coordinates are normalized). Remember to fill your
result in g_epipolar_E. To verify your result, the eight "*multiply:" stdout lines should output values very
close to zero (around 1e-6 ~ 1e-7). A NumPy sketch of this method is given after the note below. The
rendering should look like:
(Here the 8 matchings are the 8 vertices of the "cube", but your matching can be anything.)
- [1p] Explain what lines 382-389 do. What does the "multiply" result mean? Why should all the multiply
values be (close to) zero?
- [3p] Download the OpenCV sfm module source code at https://github.com/opencv/ope... Go
to \modules\sfm\src\libmv_light\libmv\multiview. Explain the following functions:
FundamentalFromEssential() in fundamental.cc [1p].
MotionFromEssential() in fundamental.cc [1p].
P_From_KRt() in projection.cc [1p].
Note: "HZ" means the textbook "Multiple View Geometry in Computer Vision" by Richard Hartley and
Andrew Zisserman, and a PDF is provided for your reference.
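As referenced in the [5p] task above, here is a NumPy sketch of the normalized eight-point method, assuming the 8 matches arrive as two 8 x 2 arrays in the [-1,1] convention (so the recovered matrix plays the role of g_epipolar_E):

import numpy as np

def normalize(pts):
    # Translate to the centroid and scale so the mean distance is sqrt(2)
    # (the normalization recommended in HZ).
    c = pts.mean(axis=0)
    s = np.sqrt(2) / np.mean(np.linalg.norm(pts - c, axis=1))
    T = np.array([[s, 0, -s * c[0]],
                  [0, s, -s * c[1]],
                  [0, 0, 1]])
    pts_h = np.column_stack([pts, np.ones(len(pts))])
    return (T @ pts_h.T).T, T

def eight_point(left, right):
    l, Tl = normalize(left)
    r, Tr = normalize(right)
    # Each match contributes one row of A f = 0, from x_r^T E x_l = 0.
    A = np.column_stack([r[:, 0] * l[:, 0], r[:, 0] * l[:, 1], r[:, 0],
                         r[:, 1] * l[:, 0], r[:, 1] * l[:, 1], r[:, 1],
                         l[:, 0], l[:, 1], np.ones(len(l))])
    _, _, Vt = np.linalg.svd(A)
    E = Vt[-1].reshape(3, 3)             # null vector = flattened matrix
    # Enforce the rank-2 constraint, then undo the normalization.
    U, S, Vt = np.linalg.svd(E)
    E = U @ np.diag([S[0], S[1], 0]) @ Vt
    return Tr.T @ E @ Tl

The eight "*multiply:" checks then correspond to evaluating x_r^T E x_l for each pair, which should be near zero up to numerical error.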
A Self-Organized Computer Virus Demo in C
A program that can modify herself, copy herself somewhere else, and execute the copy, just like a gene-transformable virus.
Here's the sample code.
#include "stdio.h"
#include "windows.h"
#define MAX_LOOP 10000
int main(int argc, const char** argv)
{
    /*** THE VIRUS LOGIC PART ***/
    //GENE_MARK
    int a = 0;
    printf("GENE PRINT:%d\n", a);
    /*** THE VIRUS LOGIC PART ***/
    // MAKE A COPY OF HERSELF: READ OWN SOURCE, WRITE A MUTATED DESCENDANT
    FILE* source = fopen("test.c", "r");
    if(source==NULL)
    {
        printf("ERROR ON OPENING SOURCE.\n");
        return -1;
    }
    FILE* descendant = fopen("descendant.c", "w+");
    if(descendant==NULL)
    {
        printf("ERROR ON CREATING DESCENDANT.\n");
        fclose(source);
        return -1;
    }
    // LINE BUFFER: ACCUMULATES ONE SOURCE LINE AT A TIME
    char buff[100] = {0};
    char letter = 0;
    int loop = 0;
    int buff_idx = 0;
    int mutate_next = 0;   // SET WHEN THE FIRST GENE MARK LINE WAS JUST COPIED
    int gene_done = 0;     // ENSURES ONLY THE FIRST GENE IS TRANSFORMED
    // COPY THE SOURCE LINE BY LINE, TRANSFORMING THE GENE ON THE WAY
    while(fread(&letter, sizeof(char), 1, source) == 1)
    {
        // ALARM: HARD STOP AGAINST RUNAWAY LOOPS
        if(loop>MAX_LOOP)
            break;
        loop++;
        buff[buff_idx] = letter;
        buff_idx++;
        // FLUSH EARLY IF A LINE IS LONGER THAN THE BUFFER
        if(buff_idx >= (int)sizeof(buff) - 1)
        {
            fwrite(buff, sizeof(char), strlen(buff), descendant);
            memset(buff, 0, sizeof(buff));
            buff_idx = 0;
        }
        if(letter=='\n')
        {
            if(mutate_next)
            {
                // TRANSFORM GENE: REWRITE THE LINE RIGHT AFTER THE FIRST MARK
                memset(buff, 0, sizeof(buff));
                strcpy(buff, "int a = 1;\n");
                mutate_next = 0;
                gene_done = 1;
            }
            // THE MARK LITERAL IS SPLIT SO THIS LINE DOES NOT MATCH ITSELF
            else if(!gene_done && strstr(buff, "//GENE" "_MARK") != NULL)
                mutate_next = 1;
            fwrite(buff, sizeof(char), strlen(buff), descendant);
            // CLEAR BUFFER FOR THE NEXT LINE
            memset(buff, 0, sizeof(buff));
            buff_idx = 0;
        }
    }
    // WRITE ANY PARTIAL LAST LINE LEFT IN THE BUFFER
    if(strlen(buff)>0)
    {
        strcat(buff, "\n");
        fwrite(buff, sizeof(char), strlen(buff), descendant);
    }
    // CLOSE ALL FILES: CLOSING FLUSHES THE DESCENDANT FILE TO DISK
    fclose(source);
    fclose(descendant);
    /*** COMPILE HERSELF ***/
    char* source_file = "descendant.c";
    char* dest_file = "descendant.exe";
    char command[100] = {0};
    strcat(command, "gcc -o ");
    strcat(command, dest_file);
    strcat(command, " ");
    strcat(command, source_file);
    // COMPILATION
    system(command);
    /***********************/
    printf("COPYING MYSELF DONE.\n");
    printf("WAITING FOR NEXT INSTRUCTION...\n");
    // READ ONE COMMAND CHARACTER: 'Y' LAUNCHES THE DESCENDANT
    char cmd = getchar();
    if(cmd=='Y')
    {
        printf("BEGIN EXECUTE THE COPYFILE EXECUTION...\n");
        //GENE_MARK
        system("descendant.exe");
        printf("EXECUTION PROCESS IS ACTIVATED, TASK DONE. EXIT SYSTEM.\n");
    }
    else
        printf("YOU CHOOSE TO EXIT SYSTEM. BYE!\n");
    return 0;
}
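A note on usage (an assumption, since the post does not say): the program opens "test.c", so the source must be saved under exactly that name, and gcc must be on the PATH for the system() compilation step to work. A typical session is gcc -o test test.c, then ./test (or test.exe on Windows), then entering Y to launch the descendant.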
If you have any suggestions or ideas, please feel free to comment below. Thanks!
Android Studio problem - E/Vision: Error loading optional module com.google.android.gms.vision.ocr
How do I solve the Android Studio problem "E/Vision: Error loading optional module com.google.android.gms.vision.ocr"?
I am running into a problem in Android Studio when using the Google Vision OCR library. This is the error:
W/DynamiteModule: Local module descriptor class for com.google.android.gms.vision.dynamite.ocr not found.
I/DynamiteModule: Considering local module com.google.android.gms.vision.dynamite.ocr:0 and remote module com.google.android.gms.vision.dynamite.ocr:0
W/DynamiteModule: Local module descriptor class for com.google.android.gms.vision.ocr not found.
I/DynamiteModule: Considering local module com.google.android.gms.vision.ocr:0 and remote module com.google.android.gms.vision.ocr:0
E/Vision: Error loading optional module com.google.android.gms.vision.ocr: com.google.android.gms.dynamite.DynamiteModule$LoadingException: No acceptable module found. Local version is 0 and remote version is 0.
Can you help me?
Solution
No effective solution for this problem has been found yet.
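Editor's note (a suggested direction, not a confirmed fix): this error typically means Google Play Services has not yet downloaded the on-device OCR module. Commonly documented mitigations for the Mobile Vision API are to declare the module in AndroidManifest.xml via a meta-data entry named com.google.android.gms.vision.DEPENDENCIES with value "ocr" so it is fetched ahead of time, to check TextRecognizer.isOperational() before first use, and to confirm the device has Google Play Services and sufficient free storage. Note also that Google has deprecated Mobile Vision in favor of ML Kit text recognition.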
awesome computer vision repo
https://blog.csdn.net/guoyunfei20/article/details/88530159
# AwesomeComputerVision
**Multi-Object-Tracking-Paper-List**
https://github.com/SpyderXu/multi-object-tracking-paper-list
**awesome-object-detection**
https://github.com/hoya012/deep_learning_object_detection
**awesome-image-classification**
https://github.com/weiaicunzai/awesome-image-classification
**Visual-Tracking-Paper-List**
https://github.com/foolwood/benchmark_results
**awesome-semantic-segmentation**
https://github.com/mrgloom/awesome-semantic-segmentation
**awesome-human-pose-estimation**
https://github.com/cbsudux/awesome-human-pose-estimation
**awesome-Face-Recognition**
Copyright notice: this article is an original post by CSDN blogger "guoyunfei20" under the CC 4.0 BY-SA license; please attach the original source link and this notice when reposting.
Original link: https://blog.csdn.net/guoyunfei20/article/details/88530159
That is all for today's sharing about CS 6476 COMPUTER VISION. Thank you for reading. If you want to learn more about 3D Computer Vision, A Self-Organized Computer Virus Demo in C, the Android Studio problem "E/Vision: Error loading optional module com.google.android.gms.vision.ocr", or the awesome computer vision repo, you can search on this site.