查找给定数字组中的数字频率

假设我们在C ++中有一个向量/数组，我们希望计算这N个元素中的哪一个具有最大重复出现次数并输出最高计数。哪种算法最适合这项工作。

例：

int a = { 2, 456, 34, 3456, 2, 435, 2, 456, 2}

输出为4，因为2次出现4次。这是2次发生的最大次数。

对数组进行排序，然后快速传递以计算每个数字。该算法具有O（N * logN）复杂度。

或者，使用数字作为键创建哈希表。在哈希表中存储您键入的每个元素的计数器。你将能够在一次通过中计算所有元素; 但是，算法的复杂性现在取决于哈希函数的复杂性。

优化空间：

Quicksort（例如）然后遍历项目，仅跟踪最大计数。最好是O（N log N）。

优化速度：

迭代所有元素，跟踪单独的计数。该算法将始终为O（n）。

如果您有RAM并且您的值不是太大，请使用计数排序。

使用STL的可能的C ++实现可能是：

 #include  #include  #include  // functor struct maxoccur { int _M_val; int _M_rep; maxoccur() : _M_val(0), _M_rep(0) {} void operator()(const std::pair &e) { std::cout << "pair: " << e.first << " " << e.second << std::endl; if ( _M_rep < e.second ) { _M_val = e.first; _M_rep = e.second; } } }; int main(int argc, char *argv[]) { int a[] = {2,456,34,3456,2,435,2,456,2}; std::map m; // load the map for(unsigned int i=0; i< sizeof(a)/sizeof(a[0]); i++) m [a[i]]++; // find the max occurence... maxoccur ret = std::for_each(m.begin(), m.end(), maxoccur()); std::cout << "value:" << ret._M_val << " max repetition:" << ret._M_rep << std::endl; return 0; }

一点伪代码：

 //split string into array firts strsplit(numbers) //PHP function name to split a string into it's components i=0 while( i < count(array)) { if(isset(list[array[i]])) { list[array[i]]['count'] = list + 1 } else { list[i]['count'] = 1 list[i]['number'] } i=i+1 } usort(list) //usort is a php function that sorts an array by its value not its key, Im assuming that you have something in c++ that does this print list[0]['number'] //Should contain the most used number

哈希算法（构建计数[i] =基本线性时间内的#occurrences（i））非常实用，但理论上不严格O（n），因为在该过程中可能存在哈希冲突。

这个问题的一个有趣的特例是多数算法，如果存在任何这样的元素，你想要找到至少有n / 2个数组条目的元素。

这是一个快速解释，以及如何在线性时间内执行此操作的更详细说明，没有任何类型的散列诡计。

如果元素的范围与元素的数量相比较大，我会像其他人所说的那样进行排序和扫描。这是时间n * log n并且没有额外的空间（可能是额外的log n）。

计数排序的问题在于，如果值的范围很大，则初始化计数数组可能比排序要花费更多时间。

这是我完整的，经过测试的版本，使用了std::tr1::unordered_map 。

我把它做成大约O（n）。首先，它遍历n个输入值以插入/更新unordered_map的计数，然后它执行partial_sort_copy ，即O（n）。 2 * O（n）〜= O（n）。

 #include  #include  #include  #include  namespace { // Only used in most_frequent but can't be a local class because of the member template struct second_greater { // Need to compare two (slightly) different types of pairs template  bool operator() (const PairA& a, const PairB& b) const { return a.second > b.second; } }; } template  std::pair::value_type, unsigned int> most_frequent(Iter begin, Iter end) { typedef typename std::iterator_traits::value_type value_type; typedef std::pair result_type; std::tr1::unordered_map counts; for(; begin != end; ++begin) // This is safe because new entries in the map are defined to be initialized to 0 for // built-in numeric types - no need to initialize them first ++ counts[*begin]; // Only need the top one at this point (could easily expand to top-n) std::vector top(1); std::partial_sort_copy(counts.begin(), counts.end(), top.begin(), top.end(), second_greater()); return top.front(); } int main(int argc, char* argv[]) { int a[] = { 2, 456, 34, 3456, 2, 435, 2, 456, 2 }; std::pair m = most_frequent(a, a + (sizeof(a) / sizeof(a[0]))); std::cout << "most common = " << m.first << " (" << m.second << " instances)" << std::endl; assert(m.first == 2); assert(m.second == 4); return 0; }

它将在O（n）…………但事情是大的没有。数组可以采用相同大小的另一个数组…………

对于（I = 0;我

擦伤=计数[O]; 指数= O;

对于（I = 0;我

然后输出将是………元素索引出现 最大值 。在这个arrays中的次数……..

这里a []是我们需要搜索某些no的最大出现的数据数组。在arrays中…….

count []具有每个元素的计数……….注意：我们知道数据的范围将在数组中…例如。该数组中的数据范围从1到100 …….然后有100个元素的计数数组来跟踪，如果它的出现增加了索引值一个……..

查找给定数字组中的数字频率

我能正确理解getopt（）吗？

在Java中使用C结构

用Xlib捕获鼠标

从Julia调用C函数并将2D数组作为指针作为参数传递

为什么数据类型的大小会随着操作系统的变化而变化？

如何使用fork（）和exec（）在Mac OS上创建进程

opencv：将Scalar转换为float或double类型

使用UML进行C编程

赋值使得指针来自int w / out of cast

int和float的大小