
android-vision – Detecting only digits with Google's Mobile Vision API?


For readers who want to learn about "android-vision – detecting only digits with Google's Mobile Vision API?", this article provides new information, along with valuable content on "A Vision for Making Deep Learning Simple", "Using Custom Vision to Detect Face Masks", the Android Studio problem "E/Vision: Error loading optional module com.google.android.gms.vision.ocr", and "android – Google Vision API – Face methods on a null object reference".

Contents:

  • android-vision – Detecting only digits with Google's Mobile Vision API?
  • A Vision for Making Deep Learning Simple
  • Using Custom Vision to Detect Face Masks
  • Android Studio: E/Vision: Error loading optional module com.google.android.gms.vision.ocr
  • android – Google Vision API – Face methods on a null object reference

android-vision – Detecting only digits with Google's Mobile Vision API?

I would like to know how to filter the detector so that it only detects numbers (integers), e.g. 1, 2, ..., 10. Currently the API detects "text" in every format.

Solution

You should do this processing on your side. Use a regular expression to filter the digits out of the string received from the Vision API:

str="Text received 123,0";
number = str.replace(/\D/g,'''');

result: 123

A Vision for Making Deep Learning Simple

When MapReduce was introduced 15 years ago, it showed the world a glimpse into the future. For the first time, engineers at Silicon Valley tech companies could analyze the entire Internet. MapReduce, however, provided low-level APIs that were incredibly difficult to use, and as a result, this “superpower” was a luxury — only a small fraction of highly sophisticated engineers with lots of resources could afford to use it.

Today, deep learning has reached its “MapReduce” point: it has demonstrated its potential; it is the “superpower” of Artificial Intelligence. Its accomplishments were unthinkable a few years ago: self-driving cars and AlphaGo would have been considered miracles.

Yet leveraging the superpower of deep learning today is as challenging as big data was yesterday: deep learning frameworks have steep learning curves because of low-level APIs; scaling out over distributed hardware requires significant manual work; and even with the combination of time and resources, achieving success requires tedious fiddling and experimenting with parameters. Deep learning is often referred to as “black magic.”

Seven years ago, a group of us started the Spark project with the singular goal to “democratize” the “superpower” of big data, by offering high-level APIs and a unified engine to do machine learning, ETL, streaming and interactive SQL. Today, Apache Spark makes big data accessible to everyone from software engineers to SQL analysts.

Continuing with that vision of democratization, we are excited to announce Deep Learning Pipelines, a new open-source library aimed at enabling everyone to easily integrate scalable deep learning into their workflows, from machine learning practitioners to business analysts.


Deep Learning Pipelines builds on Apache Spark’s ML Pipelines for training, and on Spark DataFrames and SQL for deploying models. It includes high-level APIs for common aspects of deep learning, so they can be done efficiently in a few lines of code:

  • Image loading
  • Applying pre-trained models as transformers in a Spark ML pipeline
  • Transfer learning
  • Distributed hyperparameter tuning
  • Deploying models in DataFrames and SQL

In the rest of the post, we describe each of these features in detail with examples. To try out these and further examples on Databricks, check out the notebook Deep Learning Pipelines on Databricks.

Image Loading

The first step to applying deep learning on images is the ability to load the images. Deep Learning Pipelines includes utility functions that can load millions of images into a DataFrame and decode them automatically in a distributed fashion, allowing manipulation at scale.

from sparkdl import readImages
df = readImages("/data/myimages")

We are also working on adding support for more data types, such as text and time series.

Applying Pre-trained Models for Scalable Prediction

Deep Learning Pipelines supports running pre-trained models in a distributed manner with Spark, available in both batch and streaming data processing. It houses some of the most popular models, enabling users to start using deep learning without the costly step of training a model. For example, the following code creates a Spark prediction pipeline using InceptionV3, a state-of-the-art convolutional neural network (CNN) model for image classification, and predicts what objects are in the images that we just loaded. This prediction, of course, is done in parallel with all the benefits that come with Spark:

from sparkdl import readImages, DeepImagePredictor
predictor = DeepImagePredictor(inputCol="image", outputCol="predicted_labels", modelName="InceptionV3")
predictions_df = predictor.transform(df)

In addition to using the built-in models, users can plug Keras models and TensorFlow graphs into a Spark prediction pipeline. This turns any model built with single-node tools into one that can be applied in a distributed fashion to large amounts of data.
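As a sketch of what plugging in a custom Keras model can look like with sparkdl's KerasImageFileTransformer (the model file path, the uri_df DataFrame of image URIs, and the preprocessing helper below are placeholder assumptions, not code from this post):

from keras.preprocessing.image import img_to_array, load_img
from keras.applications.inception_v3 import preprocess_input
from sparkdl import KerasImageFileTransformer

def loadAndPreprocessImage(uri):
    # Load the image at `uri` and shape it the way the Keras model expects
    # (299x299 here, matching InceptionV3-style inputs).
    img = load_img(uri, target_size=(299, 299))
    return preprocess_input(img_to_array(img))

transformer = KerasImageFileTransformer(inputCol="uri", outputCol="predictions",
                                        modelFile="/my_models/model.h5",
                                        imageLoader=loadAndPreprocessImage,
                                        outputMode="vector")
predictions_df = transformer.transform(uri_df)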

On Databricks’ Unified Analytics Platform, if you choose a GPU-based cluster, the computation intensive parts will automatically run on GPUs for best efficiency.

Transfer Learning

Pre-trained models are extremely useful when they are suitable for the task at hand, but they are often not optimized for the specific dataset users are tackling. As an example, InceptionV3 is a model optimized for image classification on a broad set of 1000 categories, but our domain might be dog breed classification. A commonly used technique in deep learning is transfer learning, which adapts a model trained for a similar task to the task at hand. Compared with training a new model from the ground up, transfer learning requires substantially less data and resources. This is why transfer learning has become the go-to method in many real-world use cases, such as cancer detection.

Deep Learning Pipelines enables fast transfer learning with the concept of a Featurizer. The following example combines the InceptionV3 model and logistic regression in Spark to adapt InceptionV3 to our specific domain. The DeepImageFeaturizer automatically peels off the last layer of a pre-trained neural network and uses the output from all the previous layers as features for the logistic regression algorithm. Since logistic regression is a simple and fast algorithm, this transfer learning training can converge quickly using far fewer images than are typically required to train a deep learning model from the ground up.

from pyspark.ml import Pipeline
from pyspark.ml.classification import LogisticRegression
from sparkdl import DeepImageFeaturizer

featurizer = DeepImageFeaturizer(modelName="InceptionV3")
lr = LogisticRegression()
p = Pipeline(stages=[featurizer, lr])

# train_images_df = ... # load a dataset of images and labels
model = p.fit(train_images_df)

Distributed Hyperparameter Tuning

Getting the best results in deep learning requires experimenting with different values for training parameters, an important step called hyperparameter tuning. Since Deep Learning Pipelines enables exposing deep learning training as a step in Spark’s machine learning pipelines, users can rely on the hyperparameter tuning infrastructure already built into Spark.

The following code plugs in a Keras Estimator and performs hyperparameter tuning using grid search with cross validation:

from pyspark.ml.tuning import CrossValidator, ParamGridBuilder
from sparkdl import KerasImageFileEstimator

myEstimator = KerasImageFileEstimator(inputCol='input',
                                      outputCol='output',
                                      modelFile='/my_models/model.h5',
                                      imageLoader=_loadProcessKeras)

kerasParams1 = {'batch_size': 10, 'epochs': 10}
kerasParams2 = {'batch_size': 5, 'epochs': 20}

myParamMaps = ParamGridBuilder() \
    .addGrid(myEstimator.kerasParams, [kerasParams1, kerasParams2]) \
    .build()

# myEvaluator = ... # define an Evaluator for your task
cv = CrossValidator(estimator=myEstimator, evaluator=myEvaluator,
                    estimatorParamMaps=myParamMaps)
cvModel = cv.fit(train_images_df)  # fit on a labeled training DataFrame
kerasTransformer = cvModel.bestModel  # of type KerasTransformer

Deploying Models in SQL

Once a data scientist builds the desired model, Deep Learning Pipelines makes it simple to expose it as a function in SQL, so anyone in their organization can use it – data engineers, data scientists, business analysts, anybody.

sparkdl.registerKerasUDF("img_classify", "/mymodels/dogmodel.h5")

Next, any user in the organization can apply prediction in SQL:

SELECT image, img_classify(image) AS label
FROM images
WHERE contains(img_classify(image), 'Chihuahua')

Similar functionality is also available in the DataFrame programmatic API across all supported languages (Python, Scala, Java, R). Similar to scalable prediction, this feature works in both batch and structured streaming.
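As a minimal sketch of that DataFrame-API route in Python (assuming the img_classify UDF registered above and an images_df DataFrame with an image column; the contains helper mirrors the SQL query above):

from pyspark.sql.functions import expr

# Call the UDF registered with registerKerasUDF from the DataFrame API,
# mirroring the SQL query above.
labeled_df = images_df.withColumn("label", expr("img_classify(image)"))
chihuahuas = labeled_df.filter(expr("contains(label, 'Chihuahua')"))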

Conclusion

In this blog post, we introduced Deep Learning Pipelines, a new library that makes deep learning drastically easier to use and scale. While this is just the beginning, we believe Deep Learning Pipelines has the potential to accomplish what Spark did to big data: make the deep learning “superpower” approachable for everybody.

Future posts in the series will cover the various tools in the library in more detail: image manipulation at scale, transfer learning, prediction at scale, and making deep learning available in SQL.

To learn more about the library, check out the Databricks notebook as well as the GitHub repository. We encourage you to give us feedback. Or even better, become a contributor and help bring the power of scalable deep learning to everyone.

Using Custom Vision to Detect Face Masks

Yuque knowledge base: https://www.yuque.com/seanyu/azure/customvision


What is Custom Vision?

Custom Vision is a cognitive service for building, deploying, and improving your own image classifiers. Custom Vision lets you decide which labels to apply: for example, to recognize a set of fruits, you would tag the fruit to be detected with labels such as "apple", "banana", and "strawberry".

What it does

The Custom Vision service uses machine learning algorithms to apply labels to images. As the developer, you supply groups of images (for example apples, bananas, strawberries) that do or do not contain the relevant characteristics, and you tag the images yourself when submitting them (marking whether the fruit is a banana or an apple). The algorithm then trains on that data and computes its own accuracy by testing itself against those same images. Once the algorithm is trained, you can test it, retrain it, and eventually use it to classify new images according to your application's needs. You can also export the model itself for offline use.

Classification and object detection

Custom Vision offers two capabilities. Image classification applies one or more labels to an image. Object detection is similar, but it also returns the coordinates within the image where the applied labels can be found.

Optimization

The Custom Vision service is optimized to quickly recognize major differences between images, so you can start prototyping with a small amount of data. Fifty images per label is usually a good starting point. The service is not well suited to detecting subtle differences in images (for example, spotting minor cracks or dents in quality-assurance scenarios).

In addition, you can choose from several Custom Vision algorithm variants that are optimized for images with certain subject matter, such as landmarks or retail items.


Example: using Custom Vision to detect whether a face mask is worn

1. Create a Custom Vision resource in the Azure Portal


Select Custom Vision and create it. Note that Custom Vision is currently only available in Global Azure.

For training and prediction, choose the region closest to where the service will ultimately be used.

2. Create an "object detection" project in the Custom Vision portal

Once the resource is created, click through to the Custom Vision portal to reach the console.

Click New Project and create a project of the "object detection" type.


3. Upload more than 100 images in the Custom Vision portal: 50 wearing masks and 50 not wearing masks


Note: the images in this example all come from the Internet.


Some basic requirements for the images:

To train the model effectively, use images with visual variety. Choose images that differ in:

  • Camera angle
  • Lighting
  • Background
  • Visual style
  • Individual or grouped subjects
  • Size
  • Type

In addition, make sure all training images meet the following criteria:

  • .jpg, .png, .bmp, or .gif format
  • No larger than 6 MB (prediction images no larger than 4 MB)
  • No shorter than 256 pixels on the shortest edge; any smaller image is automatically scaled up by the Custom Vision service


4. Tag the images in the portal, labeling them as wearing a mask or not wearing a mask

Add two tags, "mask" and "nomask".


Click each untagged image in turn and draw the tag region over the mask area (it is recommended to keep features such as the ears, eyes, and nose), tagging it "mask" or "nomask".

After finishing one image, click the paging arrow on the right until every image has been tagged.

5. Train

Click the Train button to start training.


Evaluate the detector

After training completes, the model's performance is calculated and displayed. The Custom Vision service uses the images you submitted for training to calculate precision, recall, and mean average precision. Precision and recall are two different measures of detector effectiveness:

  • Precision is the fraction of identified classifications that were correct. For example, if the model identified 100 images as dogs and 99 of them actually were dogs, precision is 99%.
  • Recall is the fraction of actual classifications that were correctly identified. For example, if there actually were 100 images of apples and the model identified 80 of them as apples, recall is 80%.
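In standard confusion-matrix terms (true positives TP, false positives FP, false negatives FN), these two measures are:

Precision = TP / (TP + FP)
Recall    = TP / (TP + FN)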

Probability threshold

Note the Probability Threshold slider on the left pane of the Performance tab. This is the level of confidence a prediction needs in order to be considered correct when computing precision and recall.

When you interpret prediction calls with a high probability threshold, they tend to return results with high precision at the expense of recall: the detected classifications are correct, but many go undetected. A low probability threshold does the opposite: most of the actual classifications are detected, but there are more false positives within that set. With this in mind, set the probability threshold according to the specific needs of your project. Later, when you receive prediction results on the client side, you should apply the same probability threshold you used here.
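As a minimal sketch of that client-side step in Python (assuming predictions is the parsed list of detections returned by the prediction API, as in the example under step 8 below):

# Keep only detections at or above the threshold chosen on the Performance tab.
THRESHOLD = 0.5  # placeholder: use the same value you set in the portal
confident = [p for p in predictions if p["probability"] >= THRESHOLD]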

Manage training iterations

Every time you train the detector, a new iteration is created with its own updated performance metrics. You can view all of your iterations in the left pane of the Performance tab. The left pane also holds a Delete button you can use to remove an iteration once it is obsolete. When you delete an iteration, you delete any images that are uniquely associated with it.


6. Test the training results; if they are not satisfactory, add more images, tag them, and train again


7. Publish

Select a prediction resource and publish.

8. Test with Postman

Get the prediction resource information.

Test the request in Postman.

Check the prediction results.
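Since the portal screenshots do not reproduce here, the following is a minimal Python sketch of the same REST call Postman makes against the Custom Vision 3.0 object-detection prediction endpoint; the resource endpoint, project ID, published iteration name, key, and test file name are all placeholders for your own values:

import requests

ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com"  # placeholder
PROJECT_ID = "<project-id>"            # shown in the Custom Vision portal
PUBLISHED_NAME = "<iteration-name>"    # the name you published under in step 7
PREDICTION_KEY = "<prediction-key>"

url = (f"{ENDPOINT}/customvision/v3.0/Prediction/"
       f"{PROJECT_ID}/detect/iterations/{PUBLISHED_NAME}/image")
headers = {"Prediction-Key": PREDICTION_KEY,
           "Content-Type": "application/octet-stream"}

with open("test.jpg", "rb") as f:
    resp = requests.post(url, headers=headers, data=f.read())

# Each detection carries a tag name ("mask" / "nomask"), a probability,
# and a bounding box in normalized coordinates.
for p in resp.json()["predictions"]:
    print(p["tagName"], round(p["probability"], 3), p["boundingBox"])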


Android Studio: E/Vision: Error loading optional module com.google.android.gms.vision.ocr

How do I fix the Android Studio problem "E/Vision: Error loading optional module com.google.android.gms.vision.ocr"?

I am running into a problem in Android Studio while using the Google Vision OCR library. Here is the error:

W/DynamiteModule: Local module descriptor class for com.google.android.gms.vision.dynamite.ocr not found.
I/DynamiteModule: Considering local module com.google.android.gms.vision.dynamite.ocr:0 and remote module com.google.android.gms.vision.dynamite.ocr:0
W/DynamiteModule: Local module descriptor class for com.google.android.gms.vision.ocr not found.
I/DynamiteModule: Considering local module com.google.android.gms.vision.ocr:0 and remote module com.google.android.gms.vision.ocr:0
E/Vision: Error loading optional module com.google.android.gms.vision.ocr: com.google.android.gms.dynamite.DynamiteModule$LoadingException: No acceptable module found. Local version is 0 and remote version is 0.

Can you help me?

Solution

No confirmed solution was posted for this question.
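One commonly documented mitigation for this class of error, per Google's Mobile Vision setup documentation (offered here as a suggestion, not as a confirmed fix for this exact report), is to declare the OCR dependency in AndroidManifest.xml so Google Play services downloads the vision module ahead of first use, and then retry on a device with sufficient free storage and an up-to-date Google Play services:

<!-- Inside the <application> element of AndroidManifest.xml -->
<meta-data
    android:name="com.google.android.gms.vision.DEPENDENCIES"
    android:value="ocr" />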


android – Google Vision API – Face methods on a null object reference

I am trying to modify the sample app that Google provides for face detection on Android.

FaceDetector detector = new FaceDetector.Builder(getApplicationContext())
        .setTrackingEnabled(false)
        .setMode(FaceDetector.ACCURATE_MODE) // Accurate mode allows to get better face detection and better position (but the detection will be slower)
        .setLandmarkType(FaceDetector.ALL_LANDMARKS)
        .build();

// This is a temporary workaround for a bug in the face detector with respect to operating
// on very small images.  This will be fixed in a future release.  But in the near term, use
// of the SafeFaceDetector class will patch the issue.
Detector<Face> safeDetector = new SafeFaceDetector(detector);

// Create a frame from the bitmap and run face detection on the frame.
Bitmap bitmap = ((BitmapDrawable) ivPhoto.getDrawable()).getBitmap();
Frame frame = new Frame.Builder().setBitmap(bitmap).build();
SparseArray<Face> faces = safeDetector.detect(frame);

if (!safeDetector.isOperational()) {
    Log.w(TAG, "Face detector dependencies are not yet available.");

    // Check for low storage.  If there is low storage, the library will not be
    // downloaded, so detection will not become operational.
    IntentFilter lowStorageFilter = new IntentFilter(Intent.ACTION_DEVICE_STORAGE_LOW);
    boolean hasLowStorage = registerReceiver(null, lowStorageFilter) != null;

    if (hasLowStorage) {
        Toast.makeText(this, R.string.low_storage_error, Toast.LENGTH_LONG).show();
        Log.w(TAG, getString(R.string.low_storage_error));
    }
}

My problem is that when I try to call methods on the detected faces, for example:

for (int i = 0; i < faces.size(); i++) {
    Face face = faces.get(i);
    float x = face.getPosition().x + (face.getWidth() / 2);
    float y = face.getPosition().y + (face.getHeight() / 2);
}

then the app sometimes crashes with this exception:

04-01 09:07:23.154 30199-30199/ch.epfl.proshare E/AndroidRuntime: FATAL EXCEPTION: main
                                                                  Process: ch.epfl.proshare, PID: 30199
                                                                  java.lang.RuntimeException: Unable to start activity ComponentInfo{ch.epfl.proshare/ch.epfl.proshare.main.MainActivity}: java.lang.NullPointerException: Attempt to invoke virtual method 'void android.support.v7.app.AppCompatActivity.setSupportActionBar(android.support.v7.widget.Toolbar)' on a null object reference
                                                                      at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:2306)
                                                                      at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:2366)
                                                                      at android.app.ActivityThread.access$800(ActivityThread.java:149)
                                                                      at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1284)
                                                                      at android.os.Handler.dispatchMessage(Handler.java:102)
                                                                      at android.os.Looper.loop(Looper.java:135)
                                                                      at android.app.ActivityThread.main(ActivityThread.java:5297)
                                                                      at java.lang.reflect.Method.invoke(Native Method)
                                                                      at java.lang.reflect.Method.invoke(Method.java:372)
                                                                      at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:908)
                                                                      at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:703)
                                                                   Caused by: java.lang.NullPointerException: Attempt to invoke virtual method 'void android.support.v7.app.AppCompatActivity.setSupportActionBar(android.support.v7.widget.Toolbar)' on a null object reference
                                                                      at ch.epfl.proshare.main.MainFragment.onCreateView(MainFragment.java:169)
                                                                      at android.support.v4.app.Fragment.performCreateView(Fragment.java:1962)
                                                                      at android.support.v4.app.FragmentManagerImpl.moveToState(FragmentManager.java:1067)
                                                                      at android.support.v4.app.FragmentManagerImpl.moveToState(FragmentManager.java:1248)
                                                                      at android.support.v4.app.FragmentManagerImpl.moveToState(FragmentManager.java:1230)
                                                                      at android.support.v4.app.FragmentManagerImpl.dispatchActivityCreated(FragmentManager.java:2042)
                                                                      at android.support.v4.app.FragmentController.dispatchActivityCreated(FragmentController.java:165)
                                                                      at android.support.v4.app.FragmentActivity.onStart(FragmentActivity.java:543)
                                                                      at android.app.Instrumentation.callActivityOnStart(Instrumentation.java:1220)
                                                                      at android.app.Activity.performStart(Activity.java:6036)
                                                                      at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:2269)
                                                                      at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:2366) 
                                                                      at android.app.ActivityThread.access$800(ActivityThread.java:149) 
                                                                      at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1284) 
                                                                      at android.os.Handler.dispatchMessage(Handler.java:102) 
                                                                      at android.os.Looper.loop(Looper.java:135) 
                                                                      at android.app.ActivityThread.main(ActivityThread.java:5297) 
                                                                      at java.lang.reflect.Method.invoke(Native Method) 
                                                                      at java.lang.reflect.Method.invoke(Method.java:372) 
                                                                      at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:908) 
                                                                      at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:703) 

I really don't understand why the SafeDetector returns a SparseArray of faces that contains null faces. Has anyone run into this problem?

Solution:

I actually just found the solution to my problem. The faces are stored in a SparseArray, which is essentially a mapping from integer IDs to faces. Getting a face should therefore be done with:

Face face = faces.valueAt(i);

instead of

Face face = faces.get(i);
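For completeness (standard Android SparseArray behavior, not stated in the original answer): faces.valueAt(i) returns the value at the i-th index, faces.keyAt(i) returns the matching key (the face ID), and faces.get(id) looks up a value by ID, returning null when no entry has that ID. Indexing a SparseArray with a plain loop counter therefore returns null whenever the IDs are not the consecutive integers 0..size-1.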
