The Z axis is usually used for depth, so it's going to be perpendicular to whatever your frame of reference (i.e. projection plane) is.
If that plane is upright in space, like a computer screen, the Z axis will be horizontal. If it's a sheet of paper lying flat on a desk, then yes, I suppose it could be argued to be vertical instead.
Not consistently. I find there are basically two schools of thought in 3D graphics:
The screen is a graph representing a 3D space:
The x axis is horizontal and the y axis is vertical. Depth, going 'into' the screen, then becomes the z axis. Mathematicians and programmers tend to like this.
The screen is a camera viewing a 3D space from within itself:
To position yourself along a line you need one coordinate: x. To position yourself on a plane, as in a 2D game, you need two: x, y. To position yourself within a volume you need three: x, y, z. Humans are kind of inherently planar spatial navigators - it's easy to think about our position in terms of "where on the ground" we are, then adjust for height. So x and y cover the ground and z becomes height. 3D artists and level designers tend to like this.
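To make the difference concrete, here's a rough sketch of moving a point between the two conventions. It assumes the "graph" convention is (x = right, y = up, z = depth into the screen) and the "ground plane" convention is (x = right, y = forward along the ground, z = up). The names `Vec3`, `screen_to_ground` and `ground_to_screen` are made up for this example, and real tools may also flip a sign or use a different handedness, so check your package's docs before relying on it.

```python
from typing import NamedTuple


class Vec3(NamedTuple):
    x: float
    y: float
    z: float


def screen_to_ground(p: Vec3) -> Vec3:
    """Assumed mapping: (right, up, depth) -> (right, forward, up)."""
    # Depth into the screen becomes "forward" on the ground plane,
    # and the screen's vertical axis becomes height.
    return Vec3(p.x, p.z, p.y)


def ground_to_screen(p: Vec3) -> Vec3:
    """Assumed mapping: (right, forward, up) -> (right, up, depth)."""
    # Swapping y and z is its own inverse, so the body is the same.
    return Vec3(p.x, p.z, p.y)


# A point 1 unit right, 2 units up and 5 units deep in the "graph" convention.
point = Vec3(1.0, 2.0, 5.0)
print(screen_to_ground(point))  # Vec3(x=1.0, y=5.0, z=2.0)
```

Under these assumptions the swap also changes handedness (the first convention is left-handed, the second right-handed), which is one reason models imported from another tool sometimes show up mirrored or lying on their side.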