Just as in segmentation, can different addresses in paging also point to the same physical memory location?

E.g. for segmentation, 0000:FFFF is equivalent to 0001:FFEF (just a hypothetical case; I don't know whether these particular addresses are actually used in programming or are reserved).
(I am new to assembly programming. Specifically x86.)

Yes, this is allowed. In fact, not only is this legal, it's also frequently used for a feature known as shared memory.

0000:FFFF is equivalent to 0001:FFEF only in real mode, VM86 mode, or SMM. In these modes, by definition, paging is not enabled. In protected mode¹ without paging, they necessarily translate to different physical addresses because the segment offsets differ (FFFF vs. FFEF) while the segment base address is the same². With paging, when the segment offsets are added to the segment base address (which could be zero), they may point to the same virtual page or to different virtual pages, but either way the 12 least significant bits of the page offsets differ (because the 12 least significant bits of the segment offsets differ), so the addresses cannot be equivalent irrespective of how the page tables are set up.
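The real-mode arithmetic behind that equivalence is simple enough to sketch (the 20-bit wrap models an 8086-style address bus, ignoring the A20 gate):

```python
def real_mode_linear(segment, offset):
    """Real-mode translation: linear = segment * 16 + offset, wrapped to 20 bits."""
    return (segment * 16 + offset) & 0xFFFFF

# The two addresses from the question resolve to the same linear address:
assert real_mode_linear(0x0000, 0xFFFF) == real_mode_linear(0x0001, 0xFFEF) == 0xFFFF
```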
In general, different addresses may translate to the same physical address. When the page offsets differ but the 12 least significant bits are nonetheless the same, the logical addresses can be translated to the same physical address if they point to pages of different sizes. Otherwise, if the virtual addresses differ in at least one of the 12 least significant bits, they cannot map to the same physical address.
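The low-12-bits argument can be made concrete with a toy page table; the virtual page numbers and frame number below are invented for illustration:

```python
PAGE = 4096  # 4 KiB pages

# Toy page table: two different virtual page numbers mapped to the SAME frame.
page_table = {0x12340: 0x00055, 0x99999: 0x00055}

def translate(vaddr):
    vpn, offset = divmod(vaddr, PAGE)
    return page_table[vpn] * PAGE + offset  # the low 12 bits pass through untouched

# Aliasing: same frame plus same low 12 bits means the same physical byte.
assert translate(0x12340123) == translate(0x99999123)
# Different low 12 bits: different physical bytes, however the tables are set up.
assert translate(0x12340FFF) != translate(0x99999FFE)
```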
¹ In protected mode, the segment selector 0000'0000'0000'00XXb is the null segment selector and cannot be used for memory access. But let's assume for the sake of argument that it is accessible (or consider 0000'0000'0000'0100b vs. 0000'0000'0000'0101b instead).
² They refer to the same segment because the segment selector indices (the most significant 13 bits of each selector) and the table indicator bits (the third least significant bit) are equal.
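A minimal demonstration of such aliasing on a modern OS, assuming only that two `mmap` mappings of the same file are backed by the same physical page(s):

```python
import mmap, os, tempfile

# Back the mapping with a one-page temporary file.
fd, path = tempfile.mkstemp()
os.ftruncate(fd, 4096)

# Two separate mappings of the same file: two distinct virtual addresses,
# both backed by the same physical page(s) in the page cache.
m1 = mmap.mmap(fd, 4096)
m2 = mmap.mmap(fd, 4096)

m1[0:5] = b"hello"       # write through the first mapping
seen = bytes(m2[0:5])    # read it back through the second mapping

m1.close(); m2.close(); os.close(fd); os.remove(path)
assert seen == b"hello"  # both mappings alias the same storage
```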

Is there a provably optimal block (piece) size for torrents (and individual files)?

The BitTorrent protocol doesn't specify block (piece) size. This is left to the user. (I've seen different torrents for the same content with 3 or more different choices.)
I'm thinking of filing a BitTorrent Enhancement Proposal which needs to make a specific block size mandatory — both for the whole torrent, and also for individual files (for which BTv2 (BEP 52) specifies bs=16KiB).
The only thing I've found that's close is the rsync block size algorithm in Tridgell & Mackerras' technical paper. Their bs=300-1100 B (# bytes aren't powers of 2).
Torrents, however, usually use bs=64kB–16MB (# bytes are powers of 2, and much larger than rsync's) for the whole torrent (and, for BTv2, 16KiB for files).
The specified block size doesn't need to be a constant. It could be a function of thing-hashed size, of course (like it is in rsync). It could also be a function of file type; e.g. there might be some block sizes which are better for making partial video/archive/etc files more usable.
See also this analysis of BitTorrent as a block-aligned file system.
So…
What are optimal block sizes for a torrent, generic file, or partial usefulness of specific file types?
Where did the 16KiB bs in BEP 52 come from?
Block and piece size are not the same thing.
A piece is the unit that is hashed into the pieces string in v1 torrents, one hash per piece.
A block is a part of a piece that is requested via request (ID 6) and delivered via piece (ID 7) messages. These messages basically consist of a (piece number, offset, length) tuple, where the length is the block size. In this sense blocks are very ephemeral constructs in v1 torrents, but they are still important, since downloading clients have to keep a lot of state about them in memory. Since the downloading client is in control of the request size, clients customarily use fixed 16KiB blocks, even though they could be more flexible. For an uploading client it does not really matter complexity-wise, as it simply has to serve the bytes covered by (piece, offset, length) and keeps no further state.
Since clients generally implement an upper message size limit to avoid DoS attacks, 16KiB is also the recommended upper bound. Specialized implementations could use larger blocks, but for public torrents that doesn't really happen.
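For reference, the wire format of such a request is tiny; a sketch of packing a v1 request message (ID 6) as described in BEP 3, with the customary 16 KiB block length:

```python
import struct

BLOCK = 16 * 1024  # the customary block size

def request_msg(index, begin, length=BLOCK):
    """v1 'request' message: 4-byte big-endian length prefix (13), 1-byte ID (6),
    then piece index, byte offset within the piece, and requested block length."""
    return struct.pack(">IBIII", 13, 6, index, begin, length)

msg = request_msg(index=2, begin=2 * BLOCK)
assert len(msg) == 17   # 4-byte prefix + 13 payload bytes
assert msg[4] == 6      # message ID
```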
For v2 torrents the picture changes a bit. There are now three concepts:
the ephemeral blocks sent via messages
the pieces (now representing some layer in the merkle tree), needed for v1 compatibility in hybrid torrents and also stored as piece layers outside the info dictionary to allow partial file resume
the leaf blocks of the merkle tree
The first type is essentially unchanged compared to v1 torrents but the incentive to use 16KiB-sized blocks is much stronger now because that is also the leaf hash size.
The piece size must now be a power of two and a multiple of 16KiB; this constraint did not exist in v1 torrents.
The leaf block size is fixed at 16KiB; it is relevant when constructing the merkle tree and when exchanging message IDs 21 (hash request) and 22 (hashes).
What are optimal block sizes for a torrent, generic file, or partial usefulness of specific file types?
For a v1 torrent, the piece size combined with the file sizes determines a lower bound on the metadata (aka .torrent file) size. Each piece must be stored as a 20-byte hash in pieces, thus larger pieces result in fewer hashes and smaller .torrent files. For terabyte-scale torrents, a 16KiB piece size results in a ~1GB torrent file, which is unacceptable for most use-cases.
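That blow-up is easy to quantify; a sketch of the lower bound (exact only when the payload divides evenly into pieces):

```python
import math

def v1_pieces_field_bytes(total_size, piece_size):
    """Lower bound on the v1 'pieces' string: one 20-byte SHA-1 hash per piece."""
    return math.ceil(total_size / piece_size) * 20

TiB, KiB, MiB = 2**40, 2**10, 2**20

# 1 TiB of payload with 16 KiB pieces: ~1.25 GiB of hashes alone.
assert v1_pieces_field_bytes(TiB, 16 * KiB) == 1_342_177_280
# The same payload with 4 MiB pieces: only 5 MiB of hashes.
assert v1_pieces_field_bytes(TiB, 4 * MiB) == 5_242_880
```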
For a v2 torrent it would result in similarly sized piece layers in the root dictionary. Or, if a client does not have the piece layers data available (e.g. because it started a download via infohash), it will have to retrieve the data via hash request messages instead, ultimately resulting in the same overhead, albeit spread out over the course of the download.
Where did the 16KiB bs in BEP 52 come from?
16KiB was already the de facto block size used by most clients. Since a merkle tree must be calculated from some leaf hashes, a fixed block size for those leaves had to be defined. Hence the established messaging block size was also chosen for the merkle tree leaves.
The only thing I've found that's close is the rsync block size algorithm in Tridgell & Mackerras' technical paper. Their bs=300-1100 B (# bytes aren't powers of 2).
rsync uses a rolling hash for content-aware chunking rather than fixed-size blocks and that is the primary driver for their chunk-size choices. So rsync considerations do not apply to bittorrent.
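A minimal sketch of rolling-hash-driven content-defined chunking in that spirit; the window size, base, and mask below are invented illustrative parameters, not rsync's (rsync uses an Adler-32-style rolling checksum):

```python
def cdc_boundaries(data, window=48, base=257, mask=0x0FFF):
    """Cut a chunk wherever a Rabin-Karp style rolling hash of the last `window`
    bytes matches `mask`; expected chunk size is roughly mask + 1 bytes."""
    MOD = 1 << 32
    pw = pow(base, window, MOD)   # base**window, for sliding the old byte out
    h = 0
    cuts = []
    for i, b in enumerate(data):
        h = (h * base + b) % MOD
        if i >= window:
            h = (h - data[i - window] * pw) % MOD   # drop the byte leaving the window
        if i >= window and (h & mask) == mask:
            cuts.append(i + 1)
    return cuts

import random
random.seed(42)
data = bytes(random.randrange(256) for _ in range(200_000))
cuts = cdc_boundaries(data)

# An insertion at the front merely shifts later boundaries by the edit size
# instead of invalidating every following fixed-size block:
assert set(cuts) <= {c - 10 for c in cdc_boundaries(bytes(10) + data)}
```

Because each boundary depends only on the last `window` bytes, a local edit resynchronizes after at most one window; fixed-size blocks lose alignment for the whole remainder of the file.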

Why are the Motorola 68k's 32-bit general-purpose registers divided into data registers and address registers?

The 68k registers are divided into two groups of eight: eight data registers (D0 to D7) and eight address registers (A0 to A7). What is the purpose of this separation? Would it not be better if they were united?
The short answer is that this separation comes from architectural limitations and design decisions made at the time.
The long answer:
The M68K implements quite a lot of addressing modes (especially compared with RISC processors), with many of its instructions supporting most, if not all, of them. This gives a large variety of addressing-mode combinations within every instruction.
This also adds complexity to opcode execution. Take the following example:
move.l $10(pc), -$20(a0,d0.l)
The instruction is just to copy a long-word from one location to another, simple enough. But in order to actually perform the operation, the processor needs to figure out the actual (raw) memory addresses to work with for both source and destination operands. This process, in which operands addressing modes are decoded (resolved), is called the effective address calculation.
For this example:
In order to calculate the source effective address, $10(pc), the processor loads the value of the PC (program counter) register and adds $10 to it.
In order to calculate the destination effective address, -$20(a0,d0.l), the processor loads the value of the A0 register, adds the value of the D0 register to it, then subtracts $20.
That is quite a lot of calculation for a single opcode, isn't it?
But the M68K is quite fast in performing these calculations. In order to calculate effective addresses quickly, it implements a dedicated Address Unit (AU).
As a general rule, operations on data registers are handled by the ALU (Arithmetic Logical Unit) and operations involving address calculations are handled by the AU (Address Unit).
The AU is well optimized for 32-bit address operations: it performs 32-bit subtraction/addition within one bus cycle (4 CPU ticks), which ALU doesn't (it takes 2 bus cycles for 32-bit operations).
However, the AU is limited to loads and basic addition/subtraction (as dictated by the addressing modes), and it is not connected to the CCR (Condition Code Register), which is why operations on address registers never update the flags.
That said, the AU was there to speed up the calculation of complex addressing modes, but it couldn't replace the ALU completely (after all, there were only about 68,000 transistors in the M68K); hence there are two register sets (data and address registers), each with its own dedicated unit.
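The effective-address arithmetic from the example can be sketched as plain integer math (the register values are invented, and this ignores the exact PC value latched during extension-word fetch):

```python
MASK32 = 0xFFFFFFFF  # addresses wrap at 32 bits

def ea_pc_disp(pc, disp):
    """d16(PC): program counter plus sign-extended 16-bit displacement."""
    return (pc + disp) & MASK32

def ea_indexed(base, index, disp):
    """d8(An,Dn.l): address register + full 32-bit index register + displacement."""
    return (base + index + disp) & MASK32

# move.l $10(pc), -$20(a0,d0.l) with PC=$1000, A0=$200000, D0=$40 resolves to:
src = ea_pc_disp(0x1000, 0x10)             # $1010
dst = ea_indexed(0x200000, 0x40, -0x20)    # $200020
```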
Based on a quick lookup: a single unified file of 16 registers would obviously be easier to program for, but each register field in an instruction would then need 4 bits instead of 3, doubling the encoding space consumed per operand. Splitting the file into two sets of eight, selected implicitly by the instruction, is not ideal, but it gives access to more registers overall without enlarging the opcodes.

What's an elegant/scalable way to dispatch hash queries, whose key is a string, to multiple machines?

I want to make it scalable. Suppose letters are all in lower case. For example, if I only have two machines, queries whose first character is within a ~ m can be dispatched to the first machine, while the n ~ z queries can be dispatched to the second machine.
However, when a third machine comes, to spread the queries as evenly as possible I have to re-calculate the rules and re-distribute the contents stored on the previous two machines. I feel this could get messy. For example, in a more complex case, when I already have 26 machines, what should I do when the 27th comes? What do people usually do to achieve scalability here?
The process of (self-) organizing machines in a DHT to split the load of handling queries to a pool of objects is called Consistent Hashing:
https://en.wikipedia.org/wiki/Consistent_hashing
I don't think there's a definitive answer to your question.
First is the question of balance. The DHT is balanced when:
each node is under similar load? (load balancing is probably what you're after)
each node is responsible for similar amounts of objects? (this is what you seemed to suggest)
(less likely) each node is responsible for similar amount of the addressing space?
I believe your objective is to make sure none of the machines is overloaded. Unless queries to a single object are enough to saturate a single machine, this is unlikely to happen if you rebalance properly.
If one of the machines is under significantly lower load than another, you can make the less-loaded machine take over some of the objects of the more heavily loaded one by shifting their positions on the ring.
Another way of rebalancing is through virtual nodes: each machine can simulate being k machines. If its load is low, it can increase its number of virtual nodes (and take over more objects); if its load is high, it can remove some of its virtual nodes.
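Putting the ring and virtual nodes together, a minimal consistent-hashing sketch (the vnode count and the use of truncated SHA-1 are arbitrary illustrative choices):

```python
import bisect, hashlib

class Ring:
    """Minimal consistent-hash ring with virtual nodes (a sketch)."""

    def __init__(self, vnodes=128):
        self.vnodes = vnodes
        self.ring = []                         # sorted (point, machine) pairs

    def _point(self, s):
        return int.from_bytes(hashlib.sha1(s.encode()).digest()[:8], "big")

    def add(self, machine):
        for i in range(self.vnodes):           # one point per virtual node
            bisect.insort(self.ring, (self._point(f"{machine}#{i}"), machine))

    def remove(self, machine):
        self.ring = [e for e in self.ring if e[1] != machine]

    def lookup(self, key):
        i = bisect.bisect(self.ring, (self._point(key), ""))
        return self.ring[i % len(self.ring)][1]   # first vnode clockwise

ring = Ring()
ring.add("m1"); ring.add("m2")
keys = [f"key-{i}" for i in range(1000)]
before = {k: ring.lookup(k) for k in keys}

ring.add("m3")                                  # the third machine joins
moved = [k for k in keys if ring.lookup(k) != before[k]]
# Only the keys that moved went to the NEW machine; m1 and m2 never trade keys,
# unlike the a-m / n-z scheme where adding a machine reshuffles the whole split:
assert all(ring.lookup(k) == "m3" for k in moved)
```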

What are the tradeoffs when generating unique sequence numbers in a distributed and concurrent environment?

I am curious about the constraints and tradeoffs of generating unique sequence numbers in a distributed and concurrent environment.
Imagine this: I have a system whose only job is to give back a unique sequence number every time you ask it. Here is an ideal spec for such a system (constraints):
Stay up under high-load.
Allow as many concurrent connections as possible.
Distributed: spread load across multiple machines.
Performance: run as fast as possible and have as much throughput as possible.
Correctness: numbers generated must:
not repeat.
be unique per request (there must be a way to break ties if two requests happen at exactly the same time).
be in (increasing) sequential order.
have no gaps between requests: 1,2,3,4... (effectively a counter for the total number of requests)
Fault tolerant: if some or all of the machines go down, the system can resume from the state before the failure.
Obviously, this is an idealized spec and not all constraints can be satisfied fully. See the CAP theorem. However, I would love to hear your analysis of various relaxations of the constraints: what types of problems are we left with, and what algorithms would we use to solve them? For example, if we drop the counter constraint, the problem becomes much easier: since gaps are allowed, we can simply partition the numeric range and map the partitions onto different machines.
Any references (papers, books, code) are welcome. I'd also like to keep a list of existing software (open source or not).
Software:
Snowflake: a network service for generating unique ID numbers at high scale with some simple guarantees.
keyspace: a publicly accessible, unique 128-bit ID generator, whose IDs can be used for any purpose
RFC 4122 implementations exist in many languages. The RFC is probably a really good base, as it removes the need for any inter-system coordination; the UUIDs are 128-bit, and IDs generated by software implementing certain versions of the spec include a time code portion that makes sorting possible, etc.
If you must be sequential (per machine) but can drop the gap/counter requirements, look for an implementation of the Version 1 UUID as specified in RFC 4122.
If you're working in .NET and can drop the sequential and gap/counter requirements, just use System.Guid. It implements RFC 4122 Version 4 and is already unique (with very low collision probability) across machines and requests. This could easily be exposed as a web service or just used locally.
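As a sketch of the Snowflake family mentioned above: the 41/10/12 bit split below follows Snowflake's published layout, and the example deliberately gives up the no-gaps counter while keeping uniqueness and per-machine ordering. A real implementation must also handle the clock moving backwards, which this sketch does not.

```python
import threading, time

class SnowflakeSketch:
    """IDs = (ms timestamp << 22) | (machine_id << 12) | sequence.
    Real Snowflake subtracts a custom epoch so the result fits in 63 bits."""

    def __init__(self, machine_id):
        assert 0 <= machine_id < 1024            # 10-bit machine id
        self.machine_id = machine_id
        self.last_ms = -1
        self.seq = 0
        self.lock = threading.Lock()

    def next_id(self):
        with self.lock:
            now = time.time_ns() // 1_000_000
            if now == self.last_ms:
                self.seq = (self.seq + 1) & 0xFFF   # 12-bit per-millisecond sequence
                if self.seq == 0:                   # 4096 IDs this ms: spin to the next ms
                    while now <= self.last_ms:
                        now = time.time_ns() // 1_000_000
            else:
                self.seq = 0
            self.last_ms = now
            return (now << 22) | (self.machine_id << 12) | self.seq

gen = SnowflakeSketch(machine_id=7)
ids = [gen.next_id() for _ in range(10_000)]
assert len(set(ids)) == len(ids)   # unique
assert ids == sorted(ids)          # increasing, though with gaps
```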
Here's a high-level idea for an approach that may fulfill all the requirements, albeit with a significant caveat that may not match many use cases.
If you can tolerate having two sequence numbers, a logical one returned immediately (guaranteed unique and ordered, but with gaps) and a separate physical one (guaranteed sequential with no gaps, but available only a short while later), then the solution seems straightforward:
One distributed system that can serve up a high resolution clock + machine id as the logical sequence number
Stream all the logical sequence numbers into a separate distributed system that orders the logical sequence numbers and maps them to the physical sequence numbers.
The mapping from logical to physical can happen on-demand as soon as the second system is done with processing.

Cell (cell-id), BTS and BSS in GSM network

What is the relation between a BTS and a cell? I think one BTS can cover a few cells, and a cell could also be covered by more than one BTS, couldn't it?
Is the identity of the concrete BTS part of the information that the mobile receives from the GSM network, or does the phone know only the cell id?
Is the identity of the BSC part of the information that the mobile receives from the GSM network?
Ad 1: Typically one BTS can handle several cells. Common patterns are one BTS covering a circular area with a single omnidirectional antenna, or a three-sector BTS covering three cells with sector antennas. One cell can only be handled by one BTS at a time; two or more BTSes are not possible, since their radio transmissions would interfere with each other. Note that this is completely different in WCDMA/UMTS since there is no concept of cells.
Ad 2: Since one cell is covered by exactly one BTS, the cell id uniquely identifies the concrete BTS.
Ad 3: Since the BTS does not contain any control logic, the mobile communicates directly with the BSC, e.g. about radio resources.
Edit after comment:
1/ The BTS is "dumb", to put it simply. It does only what the BSC instructs it to do. E.g. the BSC tells the BTS, as well as the mobile, which frequencies to use for the radio communication. A BTS does not route traffic, as it is hooked to exactly one BSC. It does not even route traffic to one of the several mobiles attached to it, as this is done by the BSC. Think of the BTS as a Um-to-Abis physical layer and protocol transcoder.
2/ Actually my earlier statement that UMTS has no cell concept is not exactly true, it's just different.
GSM uses FDMA/TDMA (frequency- and time-division multiple access). The radio channel is shared by using different frequencies (per cell) and timeslots (per mobile). Since radio frequency is used to distinguish participants, great care must be taken that no two GSM participants use the same frequency at the same time in the same location. The solution to this is cells: different geographic areas are assigned different frequencies. Network planning must ensure that no two neighbouring cells use the same frequencies, as this may lead to interference, since you cannot control the size of a cell exactly (e.g. due to absorption and reflection). In GSM, a BTS has a fixed number of radio transmission channels, depending on the BTS hardware configuration. If all channels are in use, the cell is full; this is independent of the location of a mobile within the cell.
UMTS uses CDMA (code-division multiple access). The radio channel is shared by encoding the payload in a way that allows it to be decoded later even if several senders use the same frequency range. That requires collision-free coding schemes (all codes sufficiently different from each other, to keep senders from using codes that are too similar) and a great deal of signal processing. As an analogy: at a party you can understand someone across the room even if ten people are talking. The more senders communicate within the cell, the smaller the cell gets, in order to allow the BTS/Node-B to distinguish between senders. Therefore, in UMTS a cell's size is not geographically fixed; the cell "breathes" depending on its load.
OK, this thread is quite old, but requires some further clarifications for next generations.
When talking about the GSM physical network architecture, the term BTS (Base Transceiver Station) refers to the physical site itself: the 'small house with the tower' (although modern small BTSs are just boxes hung on walls or placed on rooftops).
Each such physical site can host one omni-directional cell, or several sector cells.
In GSM logical network architecture, there is some confusion.
The terms 'Cell' and 'Base Station' actually refer to the same physical entity (a set of transceiver units, each used to receive and transmit one of the paired UL/DL carrier frequencies allocated in the BA frequency set). Let's call this entity 'physical cell' just for clarification.
The term Base Station is used for radio resource management. A BSIC (BS Id Code, or BTS Id Code) is allocated for the 'physical cell' and is used in the radio-related conversations between the MS (Mobile Station) and the BSS (BTS and BSC), e.g. for measurement reports.
The BSIC is composed of 'local' parameters - Network Color Code (NCC) and BS Color Code (BCC), and is therefore unknown outside the network.
This is where the term Cell comes in:
The term Cell is used for Mobility Management. A Cell Identity (CI) is defined as a refinement of the Routing Area - one RA will include several cells in it.
The Global Cell Identifier (GCI) is composed of network, RA and CI, and is used for handovers inside and outside the network.
It is up to the BSC to convert the BSIC to the Cell Identity (the BSC may convert the BSIC directly to GCI, or the BSC converts to CI, and the MSC will convert it to GCI).
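The "local" nature of the BSIC is visible from its size alone; a sketch of its 6-bit NCC/BCC layout as described above:

```python
def pack_bsic(ncc, bcc):
    """BSIC: 3-bit Network Colour Code followed by 3-bit BS Colour Code."""
    assert 0 <= ncc < 8 and 0 <= bcc < 8
    return (ncc << 3) | bcc

def unpack_bsic(bsic):
    return bsic >> 3, bsic & 0x7

assert unpack_bsic(pack_bsic(5, 2)) == (5, 2)
assert pack_bsic(7, 7) == 0x3F   # the whole BSIC fits in 6 bits,
                                 # far too small to be globally unique
```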
Hope that helps a bit.
BTS means different things in different places!
When the words MS, BTS, and BSC appear together, BTS means the equipment between your phone and the MSC.
Sometimes we call a site (a small house and a tower) a BTS.
In Nokia GSM equipment, a cell is called a segment. Every cell has at least one BTS, and different BTSes have different functions, e.g. BTS1 provides voice service and BTS2 provides EDGE service.
The phone receives the BCCH (frequency), NCC, and BCC to identify different cells, and decodes the information on the BCCH to get the CI, LAC, etc.
