
Bitwise Operations: Popcount & Bitsets in C++

Bitwise operations 2 — popcount & bitsets
By Errichto, 4 years ago,
Part 1 (link) introduces basic bitwise operations. This is part 2 and it's mainly about (in)famous bitsets and
example problems. Also, see links to very useful advanced stuff at the bottom. EDIT: here's a video version
of this blog (on my Youtube channel).
Built-in functions
In C++, __builtin_popcount(x) returns popcount of a number — the number of ones in the binary
representation of x. Use __builtin_popcountll(x) for long longs.
There are also __builtin_clz and __builtin_ctz (and their long long versions) for counting the
number of leading or trailing zeros in a positive number. Read more here. Now, try to solve these two
simple tasks in O(1), then open the spoiler to check the solution:
Compute the biggest power of 2 that is a divisor of x. Example: f(12) = 4
Compute the smallest power of 2 that is not smaller than x. Example: f(12) = 16
While popcount is often needed, I rarely use the other two functions. During a contest, I would solve the
two tasks above in O(log x) with simple while-loops, because it's easier and more intuitive for me. Just be
aware that these can be done in O(1), and use clz or ctz if you need to speed up your solution.
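The spoiler with the solutions did not survive in this copy, so here is a minimal sketch of how the two tasks could be done in O(1) with ctz and clz. The function names are made up for illustration, and the code assumes a 32-bit unsigned int and GCC-style builtins.

```cpp
#include <cassert>

// Biggest power of 2 that divides x (x > 0): 2^(number of trailing zeros).
unsigned largest_pow2_divisor(unsigned x) {
    assert(x > 0);  // __builtin_ctz(0) is undefined
    return 1u << __builtin_ctz(x);
}

// Smallest power of 2 that is not smaller than x, for 1 <= x <= 2^31.
unsigned smallest_pow2_not_less(unsigned x) {
    assert(x >= 1 && x <= (1u << 31));
    if (x == 1) return 1;  // avoids calling __builtin_clz(0)
    // x - 1 has (32 - clz) significant bits, so the answer is 2^(32 - clz).
    return 1u << (32 - __builtin_clz(x - 1));
}
```

Working with x - 1 in the second function is what keeps exact powers of 2 unchanged: for x = 16 it returns 16, not 32.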
Motivation behind bitsets
Consider this problem: There are N ≤ 5000 workers. Each worker is available during some days of this
month (which has 30 days). For each worker, you are given a set of numbers, each from interval [1, 30],
representing his/her availability. You need to assign an important project to two workers but they will be
able to work on the project only when they are both available. Find two workers that are best for the job —
maximize the number of days when both these workers are available.
You can compute the intersection of two workers (two sets) in O(30) by using e.g. two pointers for two
sorted sequences. Doing that for every pair of workers is $O(N^2 \cdot 30)$, slightly too slow.
We can think about the availability of a worker as a binary string of length 30, which can be stored in a
single int. With this representation, we can count the intersection size in O(1) by using
__builtin_popcount(x[i] & x[j]). The complexity becomes $O(N^2)$, fast enough.
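As a rough sketch of the 30-day version, the code below packs each worker's days into one mask and tries every pair. The helper names are hypothetical, and the input is assumed to arrive as a list of day lists.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Packs days (each in [1, 30]) into one mask; bit d-1 means "available on day d".
unsigned days_to_mask(const std::vector<int>& days) {
    unsigned mask = 0;
    for (int d : days) {
        assert(1 <= d && d <= 30);
        mask |= 1u << (d - 1);
    }
    return mask;
}

// Maximum number of common days over all pairs of workers (0 if fewer than two workers).
int best_pair_overlap(const std::vector<std::vector<int>>& workers) {
    std::vector<unsigned> x;
    for (const auto& days : workers) x.push_back(days_to_mask(days));
    int best = 0;
    for (std::size_t i = 0; i < x.size(); ++i)
        for (std::size_t j = i + 1; j < x.size(); ++j)
            best = std::max(best, __builtin_popcount(x[i] & x[j]));
    return best;
}
```

For workers {2, 3, 5, 6, 8}, {2, 4, 5, 8} and {1, 2, 10, 12, 14, 16}, the best pair is the first two, sharing days 2, 5 and 8.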
What if we are given the availability for the whole year or in general for D days? We can handle $D \le 64$
in a single unsigned long long, what about bigger D?
We can split D days into convenient parts of size 64 and store the availability of a single worker in an
array of $\frac{D}{64}$ unsigned long longs. Then, the intersection can be computed in $O(\frac{D}{64})$ and the whole
complexity is $O(N^2 \cdot \frac{D}{64})$.
code
So, we can simulate a long binary number with multiple unsigned long longs. The implementation isn't that
bad but doing binary shifts would be quite ugly. Turns out all of this can be done with bitsets easily.
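The original code for this step is hidden behind the spoiler above, so the following is only a sketch of the idea, with made-up names. It stores each worker as ceil(D / 64) blocks and intersects two workers block by block.

```cpp
#include <cassert>
#include <vector>

using Mask = std::vector<unsigned long long>;  // ceil(D / 64) blocks per worker

// Builds the block representation for one worker; days are numbered 1..D.
Mask make_mask(const std::vector<int>& days, int D) {
    Mask blocks((D + 63) / 64, 0ULL);
    for (int d : days) {
        assert(1 <= d && d <= D);
        int bit = d - 1;
        blocks[bit / 64] |= 1ULL << (bit % 64);
    }
    return blocks;
}

// Size of the intersection of two masks of equal length, in O(D / 64).
int intersection_size(const Mask& a, const Mask& b) {
    assert(a.size() == b.size());
    int total = 0;
    for (std::size_t k = 0; k < a.size(); ++k)
        total += __builtin_popcountll(a[k] & b[k]);
    return total;
}
```

AND and popcount never move bits between blocks, which is why they stay simple here. A shift would have to carry bits across block boundaries.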
Bitsets
bitset<365> is a binary number with 365 bits available, and it supports most binary operations. The
code above simplifies to:
code
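The original snippet is not included in this copy. A sketch of what the bitset version might look like, under the same assumptions as before (hypothetical names, days numbered from 1):

```cpp
#include <algorithm>
#include <bitset>
#include <cassert>
#include <vector>

const int D = 365;  // number of days; the bitset size must be a compile-time constant

// Maximum number of common days over all pairs of workers; days are numbered 1..D.
int best_pair_overlap_bitset(const std::vector<std::vector<int>>& workers) {
    std::vector<std::bitset<D>> x(workers.size());
    for (std::size_t i = 0; i < workers.size(); ++i)
        for (int d : workers[i]) {
            assert(1 <= d && d <= D);
            x[i][d - 1] = 1;
        }
    int best = 0;
    for (std::size_t i = 0; i < x.size(); ++i)
        for (std::size_t j = i + 1; j < x.size(); ++j)
            best = std::max(best, static_cast<int>((x[i] & x[j]).count()));
    return best;
}
```

The vector of bitsets lives on the heap, so a large number of workers does not put pressure on the stack.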
Some functions differ, e.g. x.count() instead of __builtin_popcount(x), but it's only more
convenient. You can read and print binary numbers, and construct a bitset from an int or a string:
bitset<100> a(17); bitset<100> b("1010");. You can even access particular bits with b[i]. Read more in
the C++ reference: https://en.cppreference.com/w/cpp/utility/bitset.
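To make these operations concrete, here is a small illustrative function (the name is invented) that builds two bitsets, changes one bit and returns the printed form of their OR. It uses 8 bits instead of 100 only to keep the printed string short.

```cpp
#include <bitset>
#include <cassert>
#include <string>

// Returns the 8-character binary text of (a | b).
std::string or_as_text() {
    std::bitset<8> a(17);      // 00010001, built from an int
    std::bitset<8> b("1010");  // 00001010, the string is read as a binary number
    b[2] = 1;                  // bit 2 counted from the least significant bit: 00001110
    assert(b.count() == 3);    // count() plays the role of popcount
    return (a | b).to_string();
}
```

Note that b[0] is the rightmost character of the printed string, which matches how we usually write binary numbers.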
Note that the size of the bitset must be a compile-time constant. You can't read n and then declare
bitset<n> john;. If n is up to 100, just create bitset<100>.
The complexity of bitwise operations is $O(\frac{size}{32})$ or $O(\frac{size}{64})$, depending on the architecture of your
computer.
Problems
P1. Different numbers — You are given a sequence of $N \le 10^7$ numbers, each from interval $[0, 10^9]$.
How many different values appear in the sequence? Don't use set or unordered_set because they are
quite slow.
solution
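The solution text is collapsed in this copy. One plausible approach, sketched below with an invented function name, is to mark every value in a huge bitset and count the first occurrences. The template parameter is there only so that the same code can be tried with a small bound.

```cpp
#include <bitset>
#include <cassert>
#include <memory>
#include <vector>

// Counts distinct values in a sequence whose elements lie in [0, MAX_VALUE].
// The bitset is allocated on the heap: 10^9 + 1 bits is about 125 MB, too much for the stack.
template <std::size_t MAX_VALUE>
int count_distinct(const std::vector<int>& values) {
    auto seen = std::make_unique<std::bitset<MAX_VALUE + 1>>();  // all bits start at 0
    int distinct = 0;
    for (int v : values) {
        assert(0 <= v && static_cast<std::size_t>(v) <= MAX_VALUE);
        if (!seen->test(v)) {  // first time we meet this value
            seen->set(v);
            ++distinct;
        }
    }
    return distinct;
}
```

For the limits in the statement the call would be count_distinct<1000000000>(values). A global bitset<1000000001> is the usual contest alternative to the heap allocation.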
P2. Knapsack — You are given $N \le 1000$ items, each with some weight $w_i$. Is there a subset of items
with total weight exactly $W \le 10^6$?
solution
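The hidden solution is not reproduced here. A common bitset technique for this kind of task, shown as a sketch with hypothetical names, keeps one bit per reachable sum and adds an item with a single shift and OR.

```cpp
#include <bitset>
#include <cassert>
#include <vector>

const int MAX_W = 1000000;  // upper bound on W from the statement

// Returns true if some subset of the weights sums to exactly W.
bool subset_sum(const std::vector<int>& weights, int W) {
    assert(0 <= W && W <= MAX_W);
    static std::bitset<MAX_W + 1> can;  // can[s] == 1 iff sum s is reachable
    can.reset();
    can[0] = 1;  // the empty subset
    for (int w : weights) {
        assert(w >= 0);
        can |= can << w;  // either skip the item, or add w to every reachable sum
    }
    return can.test(W);
}
```

Each item costs O(W / 64) or O(W / 32) operations instead of O(W), which is the whole point of using a bitset here.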
P3. Triangles in a graph — Given a graph with $n \le 2000$ vertices and $m \le n \cdot (n - 1)/2$ edges,
count triples of vertices a, b, c such that there are edges a − b, a − c and b − c.
hint
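The hint itself is collapsed in this copy, so the following is just one possible bitset approach, not necessarily the intended one. It assumes a simple graph (no loops, no repeated edges) with vertices numbered from 0, and the function name is made up.

```cpp
#include <bitset>
#include <cassert>
#include <utility>
#include <vector>

const int MAX_N = 2000;  // upper bound on n from the statement

// Counts triangles in a simple undirected graph on vertices 0..n-1.
long long count_triangles(int n, const std::vector<std::pair<int, int>>& edges) {
    assert(0 <= n && n <= MAX_N);
    std::vector<std::bitset<MAX_N>> adj(n);  // adj[a][b] == 1 iff edge a - b exists
    for (const auto& e : edges) {
        assert(e.first != e.second);
        adj[e.first][e.second] = 1;
        adj[e.second][e.first] = 1;
    }
    long long total = 0;
    for (const auto& e : edges)
        total += (adj[e.first] & adj[e.second]).count();  // common neighbours of both endpoints
    return total / 3;  // every triangle is counted once for each of its 3 edges
}
```

The running time is about m * n / 64 word operations, which is roughly 6 * 10^7 for the largest inputs.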
P4. Chef and Queries — https://www.codechef.com/problems/CHEFQUE (easy)
P5. Odd Topic — https://www.codechef.com/AABH2020/problems/ODTPIC (medium), thanks to Not-Afraid
for the suggestion
P6. Funny Gnomes — https://www.codechef.com/problems/SHAIKHGN (hard)
Bonuses
1) m & (m-1) turns off the lowest bit that was set to 1 in a positive number m. For example, we get 24
for m = 26, as 11010 changes into 11000. Explanation on quora
2) A quite similar trick allows us to iterate efficiently over all submasks of a mask, article on cp-algorithms /
e-maxx. This article also explains why masks-submasks iteration is $O(3^n)$.
3) DP on broken profile (grid dp) — https://cp-algorithms.com/dynamic_programming/profile-dynamics.html
4) SOS dp (sum over subset) — https://codeforces.com/blog/entry/45223 &
https://www.youtube.com/watch?v=Lpvsd8WpbWc&t=5m4s
5) _Find_next function and complexity notation for bitsets — https://codeforces.com/blog/entry/43718
I will add links to some problems in online judges, feel free to suggest some in the comments. I think that
bonuses 3 and 4 lack some explanation with drawings, maybe I will make some soon.
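Bonuses 1, 2 and 5 are short enough to sketch in code. The function names below are invented for illustration, and _Find_first / _Find_next are libstdc++ (GCC) extensions rather than standard C++, so the last function is an assumption about the toolchain.

```cpp
#include <bitset>
#include <cassert>
#include <vector>

// Bonus 1: clears the lowest set bit of a positive number, e.g. 26 (11010) -> 24 (11000).
unsigned clear_lowest_bit(unsigned m) {
    assert(m > 0);
    return m & (m - 1);
}

// Bonus 2: lists all submasks of mask in decreasing order, ending with 0.
std::vector<unsigned> all_submasks(unsigned mask) {
    assert(__builtin_popcount(mask) <= 20);  // the list has 2^popcount entries, keep it small
    std::vector<unsigned> result;
    for (unsigned sub = mask;; sub = (sub - 1) & mask) {
        result.push_back(sub);
        if (sub == 0) break;  // stop here: (0 - 1) & mask would wrap around to mask again
    }
    return result;
}

// Bonus 5: positions of all set bits in increasing order.
template <std::size_t N>
std::vector<std::size_t> set_bit_positions(const std::bitset<N>& b) {
    std::vector<std::size_t> positions;
    // Both calls return N when no (further) set bit exists.
    for (std::size_t i = b._Find_first(); i < N; i = b._Find_next(i))
        positions.push_back(i);
    assert(positions.size() == b.count());
    return positions;
}
```

In the submask loop, (sub - 1) & mask is the same idea as bonus 1: subtracting 1 changes the lowest set bit and everything below it, and the AND keeps only bits that belong to the mask.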
bitwise, bits, bitset
Comments (49)
4 years ago, # |
+12
This problem from Codechef is a perfect example where bitsets come in handy.
→ Reply
Not-Afraid
4 years ago, # |
+9
Would you solve a couple of dp_bitmask problems in your next post?
→ Reply
cfmaster
4 years ago, # |
+3
amazing work
→ Reply
disabled_Account
4 years ago, # |
← Rev. 2
It's worth noting that after adding #pragma GCC target("popcnt"),
__builtin_popcount() is replaced with the corresponding machine instruction (look at the
difference). In my test this gave a 2x speedup.
bitset::count() uses a __builtin_popcount() call in its implementation, so it's also
affected by this.
+27
lemelisk
→ Reply
3 years ago, # ^ |
0
I would like to see your benchmarks. When I made my benchmarks, I noticed about a
6.2% speedup. Not as drastic as you said.
Qualified
→ Reply
new, 2 weeks ago, # ^ |
0
I would like too, can you show us your bench?
→ Reply
2147483648
new, 4 days ago, # ^ |
0
https://cses.fi/problemset/task/2137/
The solution passes after adding #pragma GCC target("popcnt")

    #include <bits/stdc++.h>
    #pragma GCC target("popcnt")
    using namespace std;

    int main() {
        int n;
        cin >> n;
        // bit[j] holds column j of the grid: bit[j][i] is the cell in row i, column j.
        vector<bitset<3000>> bit(n);
        for (int i = 0; i < n; i++) {
            string s;
            cin >> s;
            for (int j = 0; j < n; j++) {
                bit[j][i] = (s[j] == '1');
            }
        }
        long long ans = 0;
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                // Rows that are black in both columns i and j; any two of them give a valid subgrid.
                long long cnt = (bit[i] & bit[j]).count();
                ans += cnt * (cnt - 1) / 2;
            }
        }
        cout << ans << endl;
        return 0;
    }

rambojack002
→ Reply
4 years ago, # |
0
See this problem on CodeChef and the comments in its editorial: some people solved it by using a bitset to speed up
the brute-force solution.
navneet.h
→ Reply
4 years ago, # |
0
Another problem where bitsets come in handy.
→ Reply
abhi2402
4 years ago, # |
What is the difference between O(1) and O(30) ? How is using bitsets 'true' constant time?
→ Reply
zxcv890
0
4 years ago, # ^ |
0
Good question. You can think about the number of bits in the architecture (usually 32 or
64) as a variable b, and then we're talking about $O(n)$ vs. $O(n/b)$. You come up with an
algorithm and its speed will depend on the machine it runs on.
Errichto
→ Reply
4 years ago, # |
← Rev. 3
0
Thanks for the informative tutorial. The following is a C++14 demo program on using the 32/64-bit
popcount/clz/ctz bit-counting built-in functions in C++. The templates included in this demo may
help beginners to use these functions without worrying about memorizing their slightly
inconvenient names.
CodingKnight
demo
→ Reply
4 years ago, # |
← Rev. 2
0
Hi Errichto, I had asked you during a recent stream whether it is possible to do an iterative
version of going through all bitmasks of length N in $O(2^N)$ time, instead of $O(N \cdot 2^N)$. We
can do it in recursion, by passing what we need to keep track of as an argument in the recursive
call. You had suggested that it is possible via some DP and bit tricks. So I thought a little bit,
and the trick used in Fenwick Tree came to mind. I have written a simple code that, for each
mask, stores all the bits that are on in it. This is the link.
I had a question regarding this: we can get the value of the last set bit using &, but we need to
find the position of the bit, i.e. if the last set bit value is 100 then the position is 2. To find the position, we
will still have to iterate over the length of the bitmask, right? Does the __builtin_ctz function take
1 operation or a number of operations equal to the number of bits? What about the bitwise & and | operators?
Of course, most operations that we ever do in CP involve at most 64-bit numbers, so
you could say O(64) = O(1), but I want to know about the constants exactly.
gupta_samarth
→ Reply
4 years ago, # ^ |
0
You can use some magic like
https://www.chessprogramming.org/BitScan#With_isolated_LS1B to get the position
instead of the value. Don't ask me how it works, idk about that but it works. You can
also do it in O(logBITS) with binary search (as mentioned in the link I gave too under
divide and conquer).
tfg
Also maybe more important to know, most often optimizing these O(BITS) that you
loop over bits is overkill and not necessary because you usually do some really light
operations also not involving cache misses so it's actually faster than you'd expect.
→ Reply
4 years ago, # ^ |
0
Wow, the divide and conquer method was mind blowing. Another question I
have always had in my mind was, when you do a&b, you must make
log(max(a, b)) operations, right? So does that make the first part of finding
the last bit value itself useless? Also, does this mean Fenwick Tree has a
$\log^2(N)$ update instead of $\log(N)$?
gupta_samarth
→ Reply
4 years ago, # ^ |
tfg
0
No, a&b should be done in few cycles and should actually be
cheaper than a+b because the ALU in cpus for sure have a bitwise
and operation and it doesn't need to carry over the carry bit so it's
fully parallel for the bits.
→ Reply
4 years ago, # ^ |
0
Oh yeah, makes sense. ( Le me: Recalls Computer
Architecture course, wonders what use is university )
→ Reply
gupta_samarth
3 years ago, # ^ |
Ritwin
0
That makes sense to me, because computers
probably have built-in circuits in the chips for
bitwise operations, where you plug in 2
numbers from one end, some magic happens in
constant time, and you get the output.
→ Reply
4 years ago, # |
+4
Update :)
I made two Youtube videos (part 1 and part2) which cover the same content as my blogs.
Errichto
And added bonus (5) with a function to find the next bit 1 in a bitset, _Find_next(i) .
→ Reply
3 years ago, # ^ |
0
I have a question
Can I implement find_next_bit() manually under O(log n) ?
SPyofgame
→ Reply
3 years ago, # ^ |
+1
Shift and then https://www.geeksforgeeks.org/position-of-rightmost-set-bit/
→ Reply
Errichto
3 years ago, # ^ |
← Rev. 4
0
IGMaster ! I have 2 questions
Is it danger if I use signed variable ?
But if I want to know next set bit of (n) starting from (p). I still
need to know the variable (n) which is from position (p) — 1 ->
0 right ?
SPyofgame
→ Reply
3 years ago, # ^ |
+1
I don't know. Google: c++ signed variable bitwise
operations
yeah, you need to shift first.
→ Reply
Errichto
3 years ago, # ^ |
← Rev. 2
0
← Rev. 3
0
Thanks IGMaster
→ Reply
SPyofgame
3 years ago, # ^ |
IGMaster ! Can I find next unset bit without reversing it ?
I edited: Master -> IGMaster
SPyofgame
→ Reply
3 years ago, # ^ |
+9
Stop calling me master. I would negate a number first (flip every bit) but for
sure something equivalent can be done. I think you're focusing too much on
this. When I had your rating, I didn't know any of this.
→ Reply
Errichto
3 years ago, # ^ |
0
Thanks and sorry for calling like that.
I'm just curious to know if there exists a simpler and more efficient bitwise
implementation
SPyofgame
→ Reply
3 years ago, # ^ |
Errichto: Stop calling me master.
SPyofgame: I edited: Master -> IGMaster
Swistakk
XDDDDDDDDDDDDDDDDDDDDDDDDDD
+28
→ Reply
4 years ago, # |
← Rev. 2
+2
Edit: This should be deleted, I was just toasted and put 1LL << 31 instead of 1LL << 32.
... ... ...
Hey people, I need some help.
For chef and queries, I have the current s_i updating like this after each iteration:
s = ((a*s+b)% MAX_W);
where MAX_W is 1LL << 31;
the type of a, s and b is long long.
I don't get why this messes up the algorithm. If I change their types to unsigned int and remove
the % operation in the update, the algorithm works fine. However, I think that the first approach
should be working fine, as it is how the problem says we need to calculate s_i.
Can someone please guide me through the reasoning behind the issue?
Thanks!
YoshiYoshiYoshi
→ Reply
2 years ago, # ^ |
0
What was size of your bitset??
I am taking bitset<1000000001> and it shows stack-overflow
while taking bitset<100000> on smaller inputs works fine
→ Reply
adil198
2 years ago, # ^ |
0
bitset<1000000001> takes about 125 MB of memory. You are (generally)
not expected to allocate so much on the stack. Use the heap or a global
variable instead.
→ Reply
imachug
2 years ago, # ^ |
0
Can you please elaborate more with code?
P4 of the above post.
In the bitset video by Errichto, he used a bitset of that size.
adil198
→ Reply
4 years ago, # |
← Rev. 2
0
Please check these 2 codes:
Problem:- https://codeforces.com/problemset/problem/550/B WA:- https://ideone.com/0eCeTX
AC:- https://codeforces.com/contest/550/submission/11417471
totally_banned
The only difference is, the AC code has 0 based array, and mine has 1 based array. Is it wrong
to make 1 based array for bitmasking?
→ Reply
4 years ago, # ^ |
+1
Then you need a bitmask with bits on positions 1 through N, so an even number up to
2^(n+1). I don't think you understand what your code does.
→ Reply
Errichto
3 years ago, # |
← Rev. 2
0
Errichto What is the complexity of __builtin_popcount(N) ? (Assuming N can be as
arbitrarily large as possible)
→ Reply
Robur_131
3 years ago, # ^ |
It will take O(32)
→ Reply
Aimless_Bot
← Rev. 2
0
3 years ago, # ^ |
0
So in other words it's just O(number of bits)? I saw this comment
claiming that it's O(log (number of bits))
→ Reply
Robur_131
3 years ago, # ^ |
Aimless_Bot
← Rev. 2
0
If that is so, I don't know why, or what exactly is happening at the
architecture or compiler level. If anyone replies to your comment,
please let me know, as I also want to know how it can be done in log(bits)
time.
→ Reply
3 years ago, # ^ |
← Rev. 2
+3
Here you have how to do it in O(log bits).
Technically speaking it is only considered O(log bits)
because it assumes adding and shifting numbers can be
done in constant time, so it does O(log bits) such
operations. This is a reasonable assumption in this
context since we are only handling numbers of 32
and 64 bits.
mnaeraxr
→ Reply
3 years ago, # ^ |
← Rev. 2
0
Treat int popcount as an O(1) operation but it's just slightly slower than simpler
operations like xor. The complexity becomes O(L/32) for bitsets of length L .
→ Reply
Errichto
13 months ago, # |
0
My program crashes when I try to create a bitset of size 10^8 or 10^9. Any idea why this would
happen? Can anyone please explain this?
→ Reply
Mohsina_Shaikh
7 months ago, # ^ |
0
if you create 10^8 bitset you will exceed the memory limit.
→ Reply
KKK
6 months ago, # |
0
https://www.codechef.com/problems/FUNARR
this problem can be done for bitset practice.
parag776
→ Reply
6 months ago, # |
2021zll
← Rev. 2
-8
I recommend a problem from THUPC 2021: P7606 [THUPC2021] 混乱邪恶 ("Chaotic Evil") on Luogu (洛谷).
It's a problem which uses std::bitset or unsigned long long with bit
shifts to represent and calculate DP states. But I'm sorry that this problem only provides the
Chinese version.
→ Reply
4 months ago, # |
0
Compute the smallest power of 2 that is not smaller than x. Example: f(12) = 16. In this question,
if I use 1 << (32 - __builtin_clz(x)), doesn't it produce the same result? Can anyone answer,
please?
The_Underdog.26
→ Reply
new, 7 weeks ago, # |
0
bitset<1000000001> bs; gives a segmentation fault when I declare it inside a function, but
it is fine if I declare it globally. Can anyone explain why this is happening, and what other restrictions there are on
bitset declarations?
d_sqaure
→ Reply
new, 7 weeks ago, # ^ |
0
Local variables are placed on the stack. This bitset takes ~125 MB, which is a lot
bigger than your local stack size (I don't remember exactly what the default stack size is). If
you try submitting it to Codeforces, you won't get any errors.
Increase the stack size using compiler flags.
Wind_Eagle
→ Reply
new, 7 weeks ago, # ^ |
← Rev. 2
0
Thanks! And if I declare vector< bool > v(10000000001); of the same size
locally, it works fine. What is the reason here? Also, it would be helpful if you could share the
way to customise the stack size using compiler flags.
d_sqaure
→ Reply
new, 13 days ago, # |
← Rev. 6
0
I was struggling to figure out the first problem posted in this post. Even after watching the
video many times I was still missing something and could not code this problem. Hence, I
would like some suggestions on whether the code below can be improved. Please
assist.
There are N ≤ 5000 workers. Each worker is available during some days of this month (which
has 30 days). For each worker, you are given a set of numbers, each from interval [1, 30] ,
representing his/her availability. You need to assign an important project to two workers but they
will be able to work on the project only when they are both available. Find two workers that are
best for the job — maximize the number of days when both these workers are available.
Test Case:
Input: workers = [[2, 3, 5, 6, 8], [2, 4, 5, 8], [1, 2, 10, 12, 14, 16]];
Output: 3, pairs: 0 and 1
liril_uri
My Code
Using bitset
My Code
→ Reply