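Suppose the true labels have m classes and the clustering result also has m clusters. The label IDs used by the two can still differ. For example, if the true labels are [1,1,2,2,3,3] and the predicted labels are [3,3,1,1,2,2], then true label 1 corresponds to predicted label 3, 2 corresponds to 1, and 3 corresponds to 2. So how do we map the cluster labels onto the true labels?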
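To see why the mapping matters, here is a minimal sketch using the toy labels above. The variable `map` is only an illustration and is written by hand here. Comparing raw label IDs makes a perfect clustering look completely wrong, while the mapped labels agree with the true labels everywhere.

```matlab
% Toy example from the text: a perfect clustering whose label IDs are permuted.
La1 = [1 1 2 2 3 3];   % true labels
La2 = [3 3 1 1 2 2];   % cluster labels

% Comparing the raw label IDs reports no agreement at all ...
rawAcc = mean(La1 == La2);

% ... whereas applying the hand-written mapping (cluster 1 -> class 2,
% cluster 2 -> class 3, cluster 3 -> class 1) recovers full agreement.
map = [2 3 1];                 % map(k) = true class assigned to cluster k (illustrative)
NewLabel = map(La2);
mappedAcc = mean(La1 == NewLabel);
```

The two methods that follow are ways of finding such a mapping automatically instead of writing it by hand.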
- % True labels: La1; cluster labels: La2; mapped labels: NewLabel
- Label1=unique(La1');
- L1=length(Label1);
- Label2=unique(La2');
- L2=length(Label2);
-
- NewLabel=zeros(size(La2));
-
- for k=1:L2
- index=(La2==Label2(k)); % samples assigned to cluster Label2(k)
- tmp=zeros(1,L1);
- for j=1:L1
- tmp(j)=sum(La1(index)==Label1(j)); % overlap with true class Label1(j)
- end
- [~,l]=max(tmp); % true class with the largest overlap (first one on ties)
- NewLabel(index)=Label1(l);
- end
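This method simply takes, for each cluster label, the true label it overlaps with most (the largest entry in that cluster's row of overlap counts) and uses its position as the mapping. The drawback is that the maxima of several different rows can fall in the same column, so the mapping between true labels and cluster labels is no longer one-to-one, which is clearly unreasonable. The second method below avoids this by using the Hungarian algorithm to enforce a one-to-one mapping.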
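The collision described above can be reproduced with a small made-up example. This is only a sketch: the labels are invented for illustration, and `accumarray` is used here as a compact way to count overlaps, assuming both label sets are the integers 1..3.

```matlab
% Invented labels: clusters 1 and 2 both lie entirely inside true class 1.
La1 = [1 1 1 1 2 2 3 3];   % true labels
La2 = [1 1 2 2 3 3 3 3];   % cluster labels

% G(k,j) = number of samples in cluster k whose true class is j.
G = accumarray([La2(:) La1(:)], 1);

% Row-wise maximum, as in the first method (ties resolve to the first column).
[~, label] = max(G, [], 2);

% Clusters 1 and 2 are both mapped to class 1, and class 3 is never used,
% so the mapping is not one-to-one.
isOneToOne = numel(unique(label)) == numel(label);
```

Here `label` comes out as `[1; 1; 2]`, so `isOneToOne` is false.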
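- % True labels: La1; cluster labels: La2; mapped labels: NewLabel
- % (munkres and MarkReplace are the helper functions listed below)
-
- Label1=unique(La1');
- L1=length(Label1);
- Label2=unique(La2');
- L2=length(Label2);
- % Build the overlap matrix G: G(i,j) counts the samples with true label Label1(i) and cluster label Label2(j)
- G = zeros(max(L1,L2),max(L1,L2));
- for i=1:L1
- index1= La1==Label1(i);
- for j=1:L2
- index2= La2==Label2(j);
- G(i,j)=sum(index1 & index2);
- end
- end
- % Use the Hungarian algorithm to find the one-to-one assignment with the largest total overlap
- % (G is negated because munkres minimizes cost)
- [index]=munkres(-G);
- % Convert the assignment matrix into a row vector: temp(j) is the row matched to column j
- [temp]=MarkReplace(index);
- % Generate the remapped labels NewLabel
- % (temp(i) is an index into Label1; it equals the true label itself when the true labels are 1..L1)
- NewLabel=zeros(size(La2));
- for i=1:L2
- NewLabel(La2==Label2(i))=temp(i);
- end
-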
- function [assignment] = munkres(costMat)
- % MUNKRES Munkres Assign Algorithm
- %
- % ASSIGN = munkres(COSTMAT) returns the optimal (minimum total cost)
- % assignment in ASSIGN for the assignment problem represented by the
- % COSTMAT, where the (i,j)th element represents the cost to assign the jth
- % job to the ith worker. ASSIGN is a logical matrix of the same size as COSTMAT.
- %
-
- % This is vectorized implementation of the algorithm. It is the fastest
- % among all Matlab implementations of the algorithm.
-
- % Examples
- % Example 1: a 5 x 5 example
- %{
- M = magic(5);
- assignment = munkres(M);
- [assignedrows,dum]=find(assignment);
- disp(assignedrows'); % 3 2 1 5 4
- disp(sum(M(assignment))); % total cost: 15
- %}
- % Example 2: n x n random data (n = 5 here)
- %{
- n=5;
- A=rand(n);
- tic
- a=munkres(A);
- toc
- %}
-
- % Reference:
- % "Munkres' Assignment Algorithm, Modified for Rectangular Matrices",
- % http://csclab.murraystate.edu/bob.pilgrim/445/munkres.html
-
- % version 1.0 by Yi Cao at Cranfield University on 17th June 2008
-
- assignment = false(size(costMat));
-
- costMat(costMat~=costMat)=Inf;
- validMat = costMat<Inf;
- validCol = any(validMat);
- validRow = any(validMat,2);
-
- nRows = sum(validRow);
- nCols = sum(validCol);
- n = max(nRows,nCols);
- if ~n
- return
- end
-
- dMat = zeros(n);
- dMat(1:nRows,1:nCols) = costMat(validRow,validCol);
-
- %*************************************************
- % Munkres' Assignment Algorithm starts here
- %*************************************************
-
- %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
- % STEP 1: Subtract the row minimum from each row.
- %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
- dMat = bsxfun(@minus, dMat, min(dMat,[],2));
-
- %**************************************************************************
- % STEP 2: Find a zero of dMat. If there are no starred zeros in its
- % column or row start the zero. Repeat for each zero
- %**************************************************************************
- zP = ~dMat;
- starZ = false(n);
- while any(zP(:))
- [r,c]=find(zP,1);
- starZ(r,c)=true;
- zP(r,:)=false;
- zP(:,c)=false;
- end
-
- while 1
- %**************************************************************************
- % STEP 3: Cover each column with a starred zero. If all the columns are
- % covered then the matching is maximum
- %**************************************************************************
- primeZ = false(n);
- coverColumn = any(starZ);
- if ~any(~coverColumn)
- break
- end
- coverRow = false(n,1);
- while 1
- %**************************************************************************
- % STEP 4: Find a noncovered zero and prime it. If there is no starred
- % zero in the row containing this primed zero, Go to Step 5.
- % Otherwise, cover this row and uncover the column containing
- % the starred zero. Continue in this manner until there are no
- % uncovered zeros left. Save the smallest uncovered value and
- % Go to Step 6.
- %**************************************************************************
- zP(:) = false;
- zP(~coverRow,~coverColumn) = ~dMat(~coverRow,~coverColumn);
- Step = 6;
- while any(any(zP(~coverRow,~coverColumn)))
- [uZr,uZc] = find(zP,1);
- primeZ(uZr,uZc) = true;
- stz = starZ(uZr,:);
- if ~any(stz)
- Step = 5;
- break;
- end
- coverRow(uZr) = true;
- coverColumn(stz) = false;
- zP(uZr,:) = false;
- zP(~coverRow,stz) = ~dMat(~coverRow,stz);
- end
- if Step == 6
- % *************************************************************************
- % STEP 6: Add the minimum uncovered value to every element of each covered
- % row, and subtract it from every element of each uncovered column.
- % Return to Step 4 without altering any stars, primes, or covered lines.
- %**************************************************************************
- M=dMat(~coverRow,~coverColumn);
- minval=min(min(M));
- if minval==inf
- return
- end
- dMat(coverRow,coverColumn)=dMat(coverRow,coverColumn)+minval;
- dMat(~coverRow,~coverColumn)=M-minval;
- else
- break
- end
- end
- %**************************************************************************
- % STEP 5:
- % Construct a series of alternating primed and starred zeros as
- % follows:
- % Let Z0 represent the uncovered primed zero found in Step 4.
- % Let Z1 denote the starred zero in the column of Z0 (if any).
- % Let Z2 denote the primed zero in the row of Z1 (there will always
- % be one). Continue until the series terminates at a primed zero
- % that has no starred zero in its column. Unstar each starred
- % zero of the series, star each primed zero of the series, erase
- % all primes and uncover every line in the matrix. Return to Step 3.
- %**************************************************************************
- rowZ1 = starZ(:,uZc);
- starZ(uZr,uZc)=true;
- while any(rowZ1)
- starZ(rowZ1,uZc)=false;
- uZc = primeZ(rowZ1,:);
- uZr = rowZ1;
- rowZ1 = starZ(:,uZc);
- starZ(uZr,uZc)=true;
- end
- end
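- % Build the assignment matrix
- assignment(validRow,validCol) = starZ(1:nRows,1:nCols);
-
- % The label-mapping problem does not need the total cost, so it is commented out
- %cost = 0;
- %cost = sum(costMat(assignment));
- % Convert the assignment matrix into a row vector holding the remapped label order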
- function [assignment] = MarkReplace(MarkMat)
-
- [rows,cols]=size(MarkMat);
-
- assignment=zeros(1,cols);
-
- for i=1:rows
- for j=1:cols
- if MarkMat(i,j)==1
- assignment(1,j)=i;
- end
- end
- end
-
- end
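Clearly, the second method is better. However, when the number of predicted clusters is larger than the number of true classes (for example, 10 true classes but 15 predicted clusters), the Hungarian algorithm cannot map every cluster onto a true class. It solves an assignment problem, which is strictly one-to-one, so the 5 extra clusters are left without a true class to match.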
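The limitation can be seen on a tiny invented case with 2 true classes and 3 clusters. The sketch below is not the Hungarian algorithm: it searches all permutations with `perms`, which is only feasible for very small matrices, but it reaches the same optimal one-to-one assignment on the zero-padded square overlap matrix. It assumes the labels are the integers 1..k.

```matlab
% Invented labels: 2 true classes, 3 clusters.
La1 = [1 1 1 2 2 2];   % true labels
La2 = [1 1 2 3 3 3];   % cluster labels

nTrue = max(La1);
nClus = max(La2);
n = max(nTrue, nClus);

% Zero-padded square overlap matrix: rows = true classes, columns = clusters.
G = zeros(n);
for i = 1:nTrue
    for j = 1:nClus
        G(i,j) = sum((La1 == i) & (La2 == j));
    end
end

% Brute-force search over all one-to-one assignments (stand-in for munkres).
P = perms(1:n);
bestScore = -1;
best = [];
for t = 1:size(P,1)
    p = P(t,:);                               % p(j) = row assigned to cluster j
    score = sum(G(sub2ind([n n], p, 1:n)));   % total overlap of this assignment
    if score > bestScore
        bestScore = score;
        best = p;
    end
end

% Clusters assigned to a padded row have no real true class.
unmatched = find(best > nTrue);
```

In this case cluster 2 ends up on the padded third row, so it receives an index that is not a real true class. The 5 extra clusters in the 10-versus-15 case end up the same way.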