
Wednesday, 13 June 2018


In statistics, the earth mover's distance (EMD) is a measure of the distance between two probability distributions over a region D. In mathematics, this is known as the Wasserstein metric. Informally, if the distributions are interpreted as two different ways of piling up a certain amount of dirt over the region D, the EMD is the minimum cost of turning one pile into the other, where the cost is assumed to be the amount of dirt moved times the distance by which it is moved.

The above definition is valid only if the two distributions have the same integral (informally, if the two piles have the same amount of dirt), as in normalized histograms or probability density functions. In that case, the EMD is equivalent to the 1st Mallows distance or 1st Wasserstein distance between the two distributions.





Theory

Assume that we have a set of points in $\mathbb{R}^d$ (dimension $d$). Instead of assigning one distribution to the set of points, we can cluster them and represent the point set in terms of the clusters. Thus, each cluster is a single point in $\mathbb{R}^d$, and the weight of the cluster is determined by the fraction of the distribution present in that cluster. This representation of a distribution by a set of clusters is called the signature. Two signatures can have different sizes; for example, a bimodal distribution has a shorter signature (2 clusters) than a more complex one. One cluster representative (mean or mode in $\mathbb{R}^d$) can be thought of as a single feature in a signature. The distance between each pair of features is called the ground distance.
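The construction of a signature can be sketched as follows. This is only an illustration: the text does not say how the clusters are formed, so the sketch assumes a crude grid clustering, and the names `make_signature` and `cell` are hypothetical.

```python
from collections import defaultdict
from math import floor

def make_signature(points, cell=1.0):
    """Group points of R^d by grid cell (an assumed, very crude clustering).

    Returns a list of (representative, weight) pairs, where the
    representative is the cluster mean and the weight is the fraction
    of all points that fall in that cluster.
    """
    buckets = defaultdict(list)
    for p in points:
        key = tuple(floor(x / cell) for x in p)
        buckets[key].append(p)

    total = len(points)
    signature = []
    for key in sorted(buckets):
        members = buckets[key]
        dim = len(members[0])
        mean = tuple(sum(m[k] for m in members) / len(members) for k in range(dim))
        signature.append((mean, len(members) / total))
    return signature

# Four points in R^2 forming two well-separated groups.
points = [(0.1, 0.2), (0.3, 0.4), (5.0, 5.0), (5.2, 5.4)]
sig = make_signature(points, cell=1.0)
for representative, weight in sig:
    print(representative, weight)
```

Each entry of the result is one feature of the signature together with its weight; the weights sum to 1 because they are fractions of the point set.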

The EMD can be formulated and solved as a transportation problem. Suppose that several suppliers, each with a given amount of goods, are required to supply several consumers, each with a given limited capacity. For each supplier-consumer pair, the cost of transporting a single unit of goods is given. The transportation problem is then to find the least expensive flow of goods from the suppliers to the consumers that satisfies the consumers' demand. Similarly, here the problem is transforming one signature ($P$) into another ($Q$) with the minimum work done.

We want to find a flow $F = [f_{i,j}]$, with $f_{i,j}$ the flow between $p_i$ and $q_j$, that minimizes the overall cost. Here $p_i$ is the $i$-th of the $m$ clusters of $P$, $q_j$ is the $j$-th of the $n$ clusters of $Q$, and $d_{i,j}$ is the ground distance between them.

$$\min \sum_{i=1}^{m} \sum_{j=1}^{n} f_{i,j} \, d_{i,j}$$

The optimal flow $F$ is found by solving this linear optimization problem. The earth mover's distance is defined as the work normalized by the total flow:

$$\mathrm{EMD}(P, Q) = \frac{\sum_{i=1}^{m} \sum_{j=1}^{n} f_{i,j} \, d_{i,j}}{\sum_{i=1}^{m} \sum_{j=1}^{n} f_{i,j}}$$
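To make the normalization concrete, here is a small sketch that evaluates the ratio above for a flow that is assumed to be given and already optimal. It does not solve the optimization problem itself. The function name `emd_from_flow` and the example numbers are illustrative assumptions, not taken from the source.

```python
def emd_from_flow(flow, dist):
    """Work done by a flow, normalized by the total flow.

    flow[i][j] is f_{i,j} and dist[i][j] is d_{i,j}. The flow is
    assumed to be feasible and optimal; this function only evaluates
    the ratio and performs no optimization.
    """
    work = sum(f * d
               for flow_row, dist_row in zip(flow, dist)
               for f, d in zip(flow_row, dist_row))
    total_flow = sum(sum(row) for row in flow)
    return work / total_flow

# Assumed example: P has clusters at positions 0 and 4 with masses 4 and 6,
# Q has clusters at positions 1 and 3 with masses 5 and 5.
dist = [[1, 3],
        [3, 1]]   # ground distances |position of p_i - position of q_j|
flow = [[4, 0],
        [1, 5]]   # row sums 4 and 6, column sums 5 and 5

emd = emd_from_flow(flow, dist)
print(emd)  # work 12 divided by total flow 10
```

In this example the work is 4·1 + 0·3 + 1·3 + 5·1 = 12 and the total flow is 10, so the ratio is 1.2.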




Extensions

Some applications may require the comparison of distributions with different total masses. One approach is to allow a partial match, where dirt from the more massive distribution is rearranged to make the less massive one, and any leftover "dirt" is discarded at no cost. Under this approach, the EMD is no longer a true distance between distributions.

Another approach is to allow mass to be created or destroyed, on a global and/or local level, as an alternative to transportation, but with a cost penalty. In that case one must specify a real parameter $\sigma$, the ratio between the cost of creating or destroying one unit of "dirt" and the cost of transporting it by a unit distance. This is equivalent to minimizing the sum of the earth-moving cost plus $\sigma$ times the L1 distance between the rearranged pile and the second distribution.

Notationally, if $\pi : P \to Q$ is a partial function which is a bijection on subsets $P' \subset P$ and $Q' \subset Q$, then one is interested in the distance function

$$|P - Q|_{\pi} = |P \setminus P'| + |Q \setminus Q'|$$

where $P \setminus P'$ denotes set minus. Here, $P'$ would be the portion of the earth that was moved; thus $P \setminus P'$ would be the portion not moved, and $|P \setminus P'|$ the size of the pile not moved. By symmetry, one contemplates $Q' \subset Q$ as the pile at the destination that 'got there' from P, compared to the total Q that we want to have there. Formally, this distance indicates how much an injective correspondence differs from an isomorphism.



Computing the EMD

The EMD can be computed by solving an instance of the transportation problem, using any algorithm for the minimum-cost flow problem, e.g. the network simplex algorithm.
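As a hedged illustration, the sketch below solves a small transportation instance with successive shortest paths (Bellman-Ford on the residual graph). This technique is swapped in for the network simplex algorithm because it is shorter to write; it is not the method named above. The function name `min_cost_flow`, the integer masses, and the example data are assumptions made for the sketch.

```python
def min_cost_flow(supply, demand, dist):
    """Return (total_cost, total_flow) of a min-cost maximum flow.

    supply[i] and demand[j] are assumed to be non-negative integers;
    dist[i][j] is the cost of moving one unit from supplier i to consumer j.
    """
    m, n = len(supply), len(demand)
    s, t = m + n, m + n + 1
    # Each edge is [target, residual capacity, cost, index of reverse edge].
    graph = [[] for _ in range(m + n + 2)]

    def add_edge(u, v, cap, cost):
        graph[u].append([v, cap, cost, len(graph[v])])
        graph[v].append([u, 0, -cost, len(graph[u]) - 1])

    for i in range(m):
        add_edge(s, i, supply[i], 0)
    for j in range(n):
        add_edge(m + j, t, demand[j], 0)
    for i in range(m):
        for j in range(n):
            add_edge(i, m + j, min(supply[i], demand[j]), dist[i][j])

    INF = float("inf")
    total_flow, total_cost = 0, 0
    while True:
        # Bellman-Ford: cheapest s -> t path using edges with spare capacity.
        best = [INF] * len(graph)
        prev = [None] * len(graph)  # (node, edge index) used to reach a node
        best[s] = 0
        for _ in range(len(graph) - 1):
            changed = False
            for u in range(len(graph)):
                if best[u] == INF:
                    continue
                for k, (v, cap, cost, _) in enumerate(graph[u]):
                    if cap > 0 and best[u] + cost < best[v]:
                        best[v] = best[u] + cost
                        prev[v] = (u, k)
                        changed = True
            if not changed:
                break
        if best[t] == INF:
            break  # no augmenting path left

        # Find the bottleneck capacity along the path, then push that much.
        push, v = INF, t
        while v != s:
            u, k = prev[v]
            push = min(push, graph[u][k][1])
            v = u
        v = t
        while v != s:
            u, k = prev[v]
            graph[u][k][1] -= push
            graph[v][graph[u][k][3]][1] += push
            v = u
        total_flow += push
        total_cost += push * best[t]
    return total_cost, total_flow

# Assumed example: suppliers at positions 0 and 4, consumers at 1 and 3.
cost, moved = min_cost_flow([4, 6], [5, 5], [[1, 3], [3, 1]])
print(cost, moved, cost / moved)
```

For this instance the cheapest plan moves 10 units at a total cost of 12, so the normalized value is 1.2. The sketch is meant for small inputs; a production solver would use a dedicated minimum-cost flow library.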

The Hungarian algorithm can be used to obtain the solution if the domain D is the set {0,1}. If the domain is integral, it can be translated for the same algorithm by representing integral bins as multiple binary bins.
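Reading the {0,1} case as "every bin holds either no dirt or exactly one unit", the problem becomes an assignment problem between two equally sized sets of unit masses. The sketch below illustrates that reduction with a brute-force search over all permutations. This is a stand-in for the Hungarian algorithm that is only practical for very small inputs. The function name `emd_unit_masses` and the example points are assumptions.

```python
from itertools import permutations

def emd_unit_masses(xs, ys, ground_distance):
    """EMD between two equally sized sets of unit masses.

    Tries every one-to-one assignment of xs to ys (factorial time) and
    keeps the cheapest; the Hungarian algorithm would find the same
    optimum in polynomial time.
    """
    n = len(xs)
    if n != len(ys):
        raise ValueError("both sets must hold the same number of unit masses")
    best = min(
        sum(ground_distance(xs[i], ys[perm[i]]) for i in range(n))
        for perm in permutations(range(n))
    )
    return best / n  # normalize by the total flow, which is n units

xs = [0, 2, 5]
ys = [1, 6, 3]
result = emd_unit_masses(xs, ys, lambda a, b: abs(a - b))
print(result)
```

Here the cheapest assignment pairs 0 with 1, 2 with 3 and 5 with 6, for a total cost of 3 over 3 units.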

As a special case, if D is a one-dimensional array of "bins", the EMD can be computed efficiently by scanning the array and keeping track of how much dirt needs to be transported between consecutive bins:

$$\begin{array}{rl} \text{EMD}_{0} &= 0 \\ \text{EMD}_{i+1} &= P_{i} + \text{EMD}_{i} - Q_{i} \\ \text{Total Distance} &= \sum |\text{EMD}_{i}| \end{array}$$
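The scan translates almost line by line into code. The sketch below assumes that both arrays have the same length and the same total mass, and that consecutive bins are one unit of distance apart. The function name `emd_1d` and the example histograms are illustrative.

```python
def emd_1d(p, q):
    """Total transport work between two 1-D histograms of equal mass.

    Assumes unit spacing between consecutive bins. `carried` plays the
    role of EMD_i: the dirt that still has to cross to the next bin.
    """
    carried = 0
    total = 0
    for p_i, q_i in zip(p, q):
        carried = p_i + carried - q_i   # EMD_{i+1} = P_i + EMD_i - Q_i
        total += abs(carried)           # accumulate |EMD_i|
    return total

p = [4, 0, 0, 0, 6]
q = [0, 5, 0, 5, 0]
work = emd_1d(p, q)
print(work, work / sum(p))
```

The running values of `carried` are 4, -1, -1, -6, 0, so the total is 12. Dividing by the total mass of 10 gives the normalized value 1.2.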



EMD-based similarity analysis

EMD-based similarity analysis (EMDSA) is an important and effective tool in many multimedia information retrieval and pattern recognition applications. However, the computational cost of the EMD is super-cubic in the number of "bins" given an arbitrary "D". Efficient and scalable EMD computation techniques for large-scale data have been investigated using MapReduce, as well as bulk synchronous parallel and resilient distributed dataset.



Applications

An early application of the EMD in computer science was to compare two grayscale images that may differ due to dithering, blurring, or local deformations. In this case, the region is the image's domain, and the total amount of light (or ink) is the "dirt" to be rearranged.

The EMD is widely used in content-based image retrieval to compute distances between the color histograms of two digital images. In this case, the region is the RGB color cube, and each image pixel is a parcel of "dirt". The same technique can be used for any other quantitative pixel attribute, such as luminance, gradient, apparent motion in a video frame, etc.

More generally, the EMD is used in pattern recognition to compare generic summaries or surrogates of data records called signatures. A typical signature consists of a list of pairs $(x_1, m_1), \ldots, (x_n, m_n)$, where each $x_i$ is a certain "feature" (e.g., a color in an image, a letter in a text, etc.), and $m_i$ is its "mass" (how many times that feature occurs in the record). Alternatively, $x_i$ may be the centroid of a data cluster, and $m_i$ the number of entities in that cluster. To compare two such signatures with the EMD, one must define a distance between features, which is interpreted as the cost of turning a unit mass of one feature into a unit mass of the other. The EMD between two signatures is then the minimum cost of turning one of them into the other.



History

The concept was first introduced by Gaspard Monge in 1781, and it anchors the field of transportation theory. The use of the EMD as a distance measure for monochromatic images was described in 1989 by S. Peleg, M. Werman and H. Rom. The name "earth mover's distance" was proposed by J. Stolfi in 1994, and was used in print in 1998 by Y. Rubner, C. Tomasi and L. G. Guibas.





External links

  • C code for the Earth Mover's Distance
  • Python2 wrapper for the C implementation of the Earth Mover's Distance
  • C++, Matlab and Java wrapper code for the Earth Mover's Distance, especially efficient for thresholded ground distances
  • Java implementation of a generic generator for evaluating Earth Mover's Distance based similarity analysis

Source of the article : Wikipedia
