• 6 Posts
  • 169 Comments
Joined 1 year ago
Cake day: May 8th, 2023

  • Would you say its unfair to base pricing on any attribute of your customer/customer base?

    A business being in a position to be able to implement differential pricing (at least beyond how they divide up their fixed costs) is a sign that something is unfair. The unfairness is not how they implement differential pricing, but that they can do it at all and still have customers.

    YouTube can implement differential pricing because there is a power imbalance between them and consumers - if the consumers want access to a lot of content provided by people other than YouTube through YouTube, YouTube is in a position to say ‘take it or leave it’ about their prices, and consumers do not have another reasonable choice.

    The reason they have this imbalance of market power and can implement differential pricing is because there are significant barriers to entry to compete with YouTube, preventing the emergence of a field of competitors. If anyone on the Internet could easily spin up a clone of YouTube, and charge lower prices for the equivalent service, competitors would pop up and undercut YouTube on pricing.

    The biggest barrier is network effects - YouTube has the most users because they have the most content. They have the most content because people only upload it to them because they have the most users. So this becomes a cycle that helps YouTube and hinders competitors.

    This is a classic case where regulators should step in. Imagine if large video providers were required to federate uploaded content over ActivityPub, and anyone could set up their own YouTube competitor with all the content. The price of the cheapest YouTube clones (which would have all the same content as YouTube) would quickly drop, and no one would have a reason to use YouTube.


  • would not be surprised if regional pricing is pretty much just above the break even mark

    And in an efficient market, that’s how much the service would cost for everyone, because otherwise I could just go to a competitor of YouTube for less, and YouTube would have to lower their pricing to get customers, and so on until no one could lower their prices without losing money.

    Unfortunately, efficient markets are just a neoliberal fantasy. In real life, there are network effects - YouTube has people uploading videos to it because it has the most viewers, and it has the most viewers because it has the most videos. It’s practically impossible for anyone to compete with them effectively because of this, and this is why they can put their prices in some regions up to get more profit. The proper solution is for regulators to step in and require things like data portability (e.g. requiring monopolists to publish videos they receive over open standards like ActivityPub), but regulatory capture makes that unlikely. In a just world, this would happen and their pricing would be close to the costs of running the platform.

    So the people paying higher regional prices are paying money in a just world they shouldn’t have to pay, while those using VPNs to pay less are paying an amount closer to what it should be in a just world. That makes the VPN users people mitigating Google’s abuse, not abusers.


  • Yes, but for companies like Google, the vast majority of systems administration and SRE work is done over the Internet from wherever staff are, not by someone locally (excluding things like physical rack installation or pulling fibre, which is a minority of total effort). And generally the costs of bandwidth and installing hardware are higher in places with a smaller tech industry. For example, when Google on-sells their compute services through GCP (at prices which are likely proportional to costs) they charge about 20% more for an n1-highcpu-2 instance in Mumbai than in Oregon, US.


  • that’s abuse of regional pricing

    More like regional pricing is an attempt to maximise value extraction from consumers to best exploit their near monopoly. The abuse is by Google, and savvy consumers are working around the abuse, and then getting hit by more abuse from Google.

    Regional pricing is done as a way to create differential pricing - all businesses dream of extracting more money from wealthy customers, while still being able to make a profit on less wealthy ones rather than driving them away with high prices. They find various ways to differentiate between wealthy and less wealthy customers (for example, whether you come from a country with a higher average income, whether your User-Agent or fingerprint identifies you as using an expensive phone, and so on), and charge the wealthy more.

    However, you can be assured that they are charging the people they’ve identified as less wealthy (e.g. in a low average income region) more than their marginal cost. Since YouTube is primarily going to be driven by marginal rather than fixed costs (it is very bandwidth and server heavy), and there is no reason to expect users in high-income locations cost YouTube more, it is a safe assumption that the gap between the regional prices is all extra profit.

    High profits are a result of lack of competition - in a competitive market, they wouldn’t exist.

    So all this comes full circle to Google exploiting a non-competitive market.


  • they have ran out of VC money

    You know YouTube is owned by Google, not VC firms, right?

    Big companies sometimes keep a division / subsidiary less profitable for a time for a strategic reason, and then tighten the screws.

    They generally only do this if they believe it will eventually be profitable over the long term (or support another part of the strategy so it is profitable overall). Otherwise they would have sold / shut it down earlier - the plan is always going to be to become profitable.

    However, while an unprofitable business always means either a plan to tighten screws, or to sell it / shut it down, tightening screws doesn’t mean it is unprofitable. They always want to be more profitable, even if they already are.



  • When people say Local AI, they mean things like the Free / Open Source Ollama (https://github.com/ollama/ollama/), which you can read the source code for and check it doesn’t have anything to phone home, and you can completely control when and if you upgrade it. If you don’t like something in the code base, you can also fork it and start your own version. The actual models (e.g. Mistral is a popular one) used with Ollama are commonly represented in GGUF format (the successor to GGML), which doesn’t even carry executable code - only massive multi-dimensional arrays of numbers (tensors) that represent the parameters of the LLM.

    Now not trusting that the output is correct is reasonable. But in terms of trusting the software not to spy on you when it is FOSS, it would be no different to whether you trust other FOSS software not to spy on you (e.g. the Linux kernel, etc…). Now that is a risk to an extent if there is an xz style attack on a code base, but I don’t think the risks are materially different for ‘AI’ compared to any other software.


  • They don’t have any leverage, because the people calling the shots in Israel (and to be clear, that is the likes of Ben-Gvir and Smotrich, who want effectively no Arabs river to sea, and hence Netanyahu, who I think would do just about any atrocity no matter how abhorrent just to stay in power and out of jail) value the pretext to invade far more than they value the lives of the hostages.

    So the hostages do not actually give Hamas any leverage over Israel - hence why Israel is not willing to agree to anything. Hamas should not have taken civilians hostage or targeted civilians in the first place, and they should release them. That is still an ongoing war crime, even if it is overshadowed by bigger ones being perpetrated by the Israeli side.

    Hamas never had a chance of winning on military might.

    The best chance for a good outcome for the Palestinian people is through raising awareness of the plight of the Palestinians, resulting in international pressure. The pressure against Israel arising now is because of the severity of Israel’s war crimes, while Hamas’ war crimes are one of the key talking points used to justify not taking action. Hamas could help Palestine win the information space war by taking the high road; winning a military war is futile for them.

    While it is not fair to punish Palestinian civilians for the war crimes of Hamas just because the interests of Palestinian civilians are aligned to Hamas’ goals, there are many people who don’t see it that way. Palestinian statehood (or a non-apartheid one-state solution) would now get far more international support if the Palestinian militants shifted to peaceful resistance.


  • Blockchain is great for when you need global consensus on the ordering of events (e.g. Alice gave all her 5 ETH to Bob first, so a later transaction to give 5 ETH to Charlie is invalid). It is an unnecessarily expensive solution just for archival, since it necessitates storing the data on every node forever.
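    To make the ordering point concrete, here is a minimal Python sketch of why agreeing on the order of events matters. It is a toy in-memory ledger, not how Ethereum actually validates transactions; the `Transfer` class and `apply_in_order` function are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    sender: str
    recipient: str
    amount: int  # whole ETH, to keep the illustration simple


def apply_in_order(balances, transfers):
    """Apply transfers in the agreed order; reject any the sender cannot fund."""
    balances = dict(balances)  # work on a copy so the caller's dict is untouched
    accepted, rejected = [], []
    for t in transfers:
        if balances.get(t.sender, 0) >= t.amount:
            balances[t.sender] = balances.get(t.sender, 0) - t.amount
            balances[t.recipient] = balances.get(t.recipient, 0) + t.amount
            accepted.append(t)
        else:
            rejected.append(t)
    return balances, accepted, rejected


start = {"Alice": 5}
to_bob = Transfer("Alice", "Bob", 5)
to_charlie = Transfer("Alice", "Charlie", 5)

# Everyone agrees Bob's transfer came first, so Charlie's is invalid.
balances, accepted, rejected = apply_in_order(start, [to_bob, to_charlie])
print(balances)   # {'Alice': 0, 'Bob': 5}
print(rejected)   # [Transfer(sender='Alice', recipient='Charlie', amount=5)]
```

    If two nodes saw the two transfers in different orders, they would disagree about who ended up with the 5 ETH. That disagreement is what global consensus on ordering prevents, and it is a problem an archive does not have.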

    Ethereum charges ‘gas’ fees per transaction which helps ensure it doesn’t collapse under the weight of excess usage. Blocks have transaction limits, and transactions have size limits. It is currently working out at about US$7,500 per MB of block data (which is stored forever, and replicated to every node in the network). The Internet Archive have apparently ~50 PB of data, which would cost US$371 trillion to put onto Ethereum (in practice, attempting this would push up the price of ETH further, and if they succeeded, most nodes would not be able to keep up with the network). Really, this is just telling us that blockchain is not appropriate for that use case, and the designers of real world blockchains have created mechanisms to make it financially unviable to attempt at that scale, because it would effectively destroy the ability to operate nodes.
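    As a rough sanity check on that figure, the arithmetic can be sketched in a few lines of Python. This uses the rounded inputs quoted above (US$7,500 per MB and ~50 PB) and assumes decimal units (1 PB = 10^9 MB). With those rounded inputs it lands at about US$375 trillion, the same ballpark as the US$371 trillion above, which presumably came from less rounded numbers.

```python
USD_PER_MB = 7_500   # approximate cost per MB of block data, as quoted above
ARCHIVE_PB = 50      # approximate size of the Internet Archive, as quoted above
MB_PER_PB = 10**9    # assumption: decimal units, 1 PB = 1,000,000,000 MB


def cost_usd(petabytes, usd_per_mb=USD_PER_MB):
    """Cost in US dollars of putting `petabytes` of data on chain at a flat per-MB price."""
    return petabytes * MB_PER_PB * usd_per_mb


total = cost_usd(ARCHIVE_PB)
print(f"US${total / 1e12:,.0f} trillion")  # US$375 trillion
```

    The flat per-MB price is itself optimistic, since demand at that scale would push fees up further.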

    The only real reason to use an existing blockchain anyway would be on the theory that you could argue it is too big to fail due to legitimate business use cases, and too hard to remove censorship resistant data. However, if it became used in the majority for censorship resistant data sharing, and transactions were the minority, I doubt that this would stop authorities going after node operators and so on.

    The real problems that an archival project faces are:

    • The cost of storing and retrieving large amounts of data. That could be decentralised using a solution where not all data is stored on a chain - for example, IPFS.
    • The problem of curating data and deciding what is worth archiving, and what is a true-to-source archive vs fake copy. This probably requires either a centralised trusted party, or maybe a voting system.
    • The problem of censorship. Anonymity and opaqueness about what is on a particular node can help - but they might in some cases undermine the other goals of archival.

  • A1kmm@lemmy.amxl.com to Privacy@lemmy.ml: Internet Archive is in danger (16 days ago)

    This is absolutely because they pulled the emergency library stunt, and they were loud as hell about it. They literally broke the law and shouted about it.

    I think that you are right as to why the publishers picked them specifically to go after in the first place. I don’t think they should have done the “emergency library”.

    That said, the publishers’ arguments show they have an anti-library agenda that goes beyond just the emergency library.

    Libraries are allowed to scan/digitize books they own physically. They are only allowed to lend out as many as they physically own though. Archive knew this and allowed infinite “lend outs”. They even openly acknowledged that this was against the law in their announcement post when they did this.

    The trouble is that the publishers are not just going after them for infinite lend-outs. The publishers are arguing that they shouldn’t be allowed to lend out any digital copies of a book they’ve scanned from a physical copy, even if they lock away the corresponding number of physical copies.

    Worse, they got a court to agree with them on that, which is where the appeal comes in.

    The publishers want it to be that physical copies can only be lent out as physical copies, and for digital copies the libraries have to purchase a subscription for a set number of library patrons and concurrent borrows, specifically for digital lending, and with a finite life. This is all about growing publisher revenue. The publishers are not stopping at saying the number of digital copies lent must be less than or equal to the number of physical copies, and are going after archive.org for their entire digital library programme.


  • A1kmm@lemmy.amxl.com to Asklemmy@lemmy.ml: Are you a 'tankie' (17 days ago)

    No

    On economic policy I am quite far left - I support a low Gini coefficient, achieved through a mixed economy, but with state provided options (with no ‘think of the businesses’ pricing strategy) for the essentials and state owned options for natural monopolies / utilities / media.

    But on social policy, I support social liberties and democracy. I believe the government should intervene, with force if needed, to protect the rights of others from interference by others (including rights to bodily safety and autonomy, not to be discriminated against, the right to a clean and healthy environment, and the right not to be exploited or misled by profiteers) and to redistribute wealth from those with a surplus to those in need / to fund the legitimate functions of the state. Outside of that, people should have social and political liberties.

    I consider being a ‘tankie’ to require both the leftist aspect (✅) and the authoritarian aspect (❌), so I don’t meet the definition.



  • I think any prediction based on a ‘singularity’ neglects to consider the physical limitations, and just how long the journey towards significant amounts of AGI would be.

    The human brain has an estimated 100 trillion neuronal connections - so probably a good order of magnitude estimation for the parameter count of an AGI model.

    If we consider a current GPU, e.g. the 12 GB RTX 3060, it can hold about 24 billion parameters at 4 bit quantisation (in reality a fair few less), and uses 180 W of power. At 100 trillion parameters, that is around 4,167 such GPUs, so an AGI might use 750 kW of power to operate. A super-intelligent machine might use more. That is a farm of 2,500 300 W solar panels, while the sun is shining, just for the equivalent of one person.

    Now to pose a real threat against the billions of humans, you’d need more than one person’s worth of intelligence. Maybe an army equivalent to 1,000 people, powered by about 4,166,667 GPUs and 2,500,000 solar panels.
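    The back-of-envelope numbers can be reproduced with a short Python sketch. All inputs are the rough figures from this comment (100 trillion parameters, a 12 GB GPU at 4 bits per parameter drawing 180 W, 300 W solar panels); the `requirements` function is just a hypothetical helper for this illustration, and it ignores overheads like cooling, interconnect, and memory not available for weights.

```python
import math

AGI_PARAMS = 100 * 10**12     # ~100 trillion connections, used as a parameter count
GPU_VRAM_BYTES = 12 * 10**9   # 12 GB card
BITS_PER_PARAM = 4            # 4 bit quantisation
GPU_WATTS = 180               # power draw per GPU, as quoted above
PANEL_WATTS = 300             # output per solar panel in full sun

# 12 GB * 8 bits per byte / 4 bits per parameter = 24 billion parameters per GPU
PARAMS_PER_GPU = GPU_VRAM_BYTES * 8 // BITS_PER_PARAM


def requirements(people=1):
    """Return (GPUs, watts, solar panels) for `people` person-equivalents."""
    gpus = people * AGI_PARAMS / PARAMS_PER_GPU
    watts = people * AGI_PARAMS * GPU_WATTS / PARAMS_PER_GPU
    panels = watts / PANEL_WATTS
    return math.ceil(gpus), watts, math.ceil(panels)


print(requirements(1))      # (4167, 750000.0, 2500)
print(requirements(1000))   # (4166667, 750000000.0, 2500000)
```

    So one person-equivalent is roughly 4,167 GPUs drawing 750 kW, and 1,000 person-equivalents is roughly 4.17 million GPUs drawing 750 MW.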

    That is not going to materialise out of the air too quickly.

    In practice, as we get closer to an AGI or ASI, there will be multiple separate deployments of similar sizes (within an order of magnitude), and they won’t be aligned to each other - some systems will be adversaries of any system executing a plan to destroy humanity, and will be aligned to protect against harm (AI technologies are already widely used for threat analysis). So you’d have a bunch of malicious systems, and a bunch of defender systems, going head to head.

    The real AI risks, which I think many of the people ranting about singularities want to obscure, are:

    • An oligopoly of companies gains dominance over the AI space, and perpetuates a ‘rich get richer’ cycle, accumulating wealth and power to the detriment of society. OpenAI, Microsoft, Google and AWS are probably all battling for that. Open models are the way to battle that.
    • People can no longer trust their eyes when it comes to media; existing problems of fake news, deepfakes, and so on become so severe that they undermine any sense of truth. That might fundamentally shift society, but I think we’ll adjust.
    • Doing bad stuff becomes easier. That might be scamming, but at the more extreme end it might be designing weapons of mass destruction. On the positive side, AI can help defenders too.
    • Poor quality AI might be relied on to make decisions that affect people’s lives. Best handled through the same regulatory approaches that prevent companies and governments doing the same with simple flow charts / scripts.

  • A1kmm@lemmy.amxl.com to cats@lemmy.world: A cat entered my tent (21 days ago)

    I’m looking into it using data from my instance to check it isn’t an abuse issue.

    What I know so far:

    1. It is a lemmy.world user.
    2. That user has downvoted 548 comments, and upvoted 18. Downvoted 557 posts and upvoted 25.
    3. Timing: the downvoting has been going on for some time; it isn’t a new thing. 71 downvoted comments since 2024-06-01T00:00:00Z, 212 since the start of May (out of 548).
    4. The user has two comments ever, and no posts. One comment, on a thread about the actions of a right-wing American politician, said “Click bait lemmy for sure”. This could imply the downvotes are legitimate and coming from having an impossibly high standard for what is considered quality here, or perhaps they are related to political grudges. I’m going to look further for patterns in the downvotes. I think a bot could have done far more downvotes - so it could just be a human.

  • I looked into this previously, and found that there is a major problem for most users in the Terms of Service at https://codeium.com/terms-of-service-individual.

    Their agreement talks about “Autocomplete User Content” as meaning the context (i.e. the code you write, when you are using it to auto-complete, that the client sends to them) - so it is implied that this counts as “User Content”.

    Then they have terms saying you licence them all your user content:

    “By Posting User Content to or via the Service, you grant Exafunction a worldwide, non-exclusive, irrevocable, royalty-free, fully paid right and license (with the right to sublicense through multiple tiers) to host, store, reproduce, modify for the purpose of formatting for display and transfer User Content, as authorized in these Terms, in each instance whether now known or hereafter developed. You agree to pay all monies owing to any person or entity resulting from Posting your User Content and from Exafunction’s exercise of the license set forth in this Section.”

    So in other words, let’s say you write a 1000 line piece of software, and release it under the GPL. Then you decide to trial Codeium, and autocomplete a few tiny things, sending your 1000 lines of code as context.

    Then next week, a big corp wants to use your software in their closed source product, and doesn’t want to comply with the GPL. Exafunction can sell them a licence (“sublicense through multiple tiers”) to allow them to use the software you wrote without complying with the GPL. If it turns out that you used some GPLd code in your codebase (as the GPL allows), and the other developer sues Exafunction for violating the GPL, you have to pay any money owing.

    I emailed them about this back in December, and they didn’t respond or change their terms - so they are aware that their terms allow this interpretation.



  • Votes on this comment:

    1. Came from 14 different instances - many of them major. Of those instances, the instance with the most votes contributed was lemmy.world (i.e. your own instance), from which my instance has seen 14 votes for that comment.
    2. Of the voters, I looked at the distribution of the person IDs assigned on my instance, which approximately represents the order they were first seen by my instance (e.g. when they voted on or interacted with another comment). If there was vote manipulation, I’d expect to see lots of IDs close together. However, there are no runs of IDs that are close together. To avoid this when manipulating votes, they’d need to have planned in advance, and made accounts and used them individually over time before finally deploying them to downvote you.
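    For anyone wanting to do a similar check, the "runs of close IDs" idea from point 2 can be sketched in Python roughly like this. It is not the actual query I ran; the `longest_close_run` function, the `max_gap` threshold, and the sample IDs are all hypothetical and only illustrate the heuristic.

```python
def longest_close_run(person_ids, max_gap=5):
    """Length of the longest run of sorted IDs whose neighbours differ by <= max_gap."""
    ids = sorted(set(person_ids))
    if not ids:
        return 0
    best = current = 1
    for prev, cur in zip(ids, ids[1:]):
        if cur - prev <= max_gap:
            current += 1
            best = max(best, current)
        else:
            current = 1  # gap too large: start a new run
    return best


# Hypothetical local person IDs of the accounts that voted on one comment.
organic = [1021, 5830, 9977, 15402, 23110, 40876]
suspicious = [1021, 5830, 88001, 88002, 88004, 88005, 88007]

print(longest_close_run(organic))     # 1
print(longest_close_run(suspicious))  # 5
```

    A long run only suggests that a batch of accounts was first seen at about the same time, so it is a reason to look closer rather than proof of manipulation. A sensible `max_gap` depends on how quickly the instance sees new users.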

    If there are instances that are a significant source of vote manipulation, and the local admins are unwilling to address it, there are options available to instance admins like defederation.

    However - in the case of your comments, there is no meaningful evidence of vote manipulation.


  • The best option is to run the models locally. You’ll need a good enough GPU - I have an RTX 3060 with 12 GB of VRAM, which is enough to do a lot of local AI work.

    I use Ollama, and my favourite model to use with it is Mistral-7b-Instruct. It’s a 7 billion parameter model optimised for instruction following, but usable with 4 bit quantisation, so the model takes about 4 GB of storage.

    You can run it from the command line rather than a web interface - run the container for the server, and then something like docker exec -it ollama ollama run mistral, giving a command line interface. The model performs pretty well; not quite as well on some tasks as GPT-4, but also not brain-damaged from attempts to censor it.
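    If you would rather script it than use the interactive prompt, the Ollama server also exposes an HTTP API. Here is a minimal Python sketch using only the standard library. It assumes the container publishes the default port 11434 on localhost and that the mistral model has already been pulled; the helper names `build_request` and `ask` are made up for this example.

```python
import json
import urllib.request

# Assumption: the Ollama server is reachable on its default port on this machine.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="mistral"):
    """Build a POST request for a single, non-streamed completion."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        OLLAMA_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, model="mistral"):
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]
```

    With the server running, something like `print(ask("Explain 4 bit quantisation in one sentence."))` should print the model's reply. Nothing leaves your machine, since the request only goes to localhost.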

    By default it keeps a local history, but you can turn that off.


  • Cars definitely kill wildlife too - estimation methodologies vary, but I’ve seen estimates saying:

    • Vehicles directly kill about 10,000,000 native animals across Australia per annum. That’s not including habitat loss, and doesn’t include insects (birds, reptiles, and mammals only).
    • Pet cats kill about 546,000,000 native animals across Australia per annum. I believe that’s using a similar definition excluding insects.
    • Feral cats kill about 3,000,000,000 native animals across Australia per annum.

    Of course, habitat destruction and pollution have a huge impact as well.

    But roaming pet cats legitimately are a major part of the problem. It is possible to simultaneously replace lawns with tree cover, and reduce the burden of cats. That could also feed into a comprehensive policy of tackling stray and feral cat populations - something which is made harder in suburbs due to roaming pet cats.

    As for whether it is cruel: change is a stressor for cats, so a sudden change from outdoor access to indoor-only could increase stress levels, but that is a one-off transition and there could be ways to manage that (for example, by providing a lot of notice of a change and allowing owners to phase out access, or by having a permit system for indoor and outdoor cats, and allowing renewal of existing permits for specific microchipped cats, but no new outdoor cat permits). Outdoor access / hunting outdoors is a form of enrichment for cats, but not the only one possible. Indoor cats can play with toys, and have owners simulate chasing and hunting activities indoors (for example, with ribbons, small balls, chasing cat treats, and so on) to provide similar enrichment. At the same time, the indoors protect cats from stressful situations like encountering or being mauled by dogs, aggressive cats, foxes, brushtail possums, injuries on the roads, and disease.


  • I think the most striking thing is that for outsiders (i.e. non repo members) the acceptance rates for gendered profiles are lower by a large and significant amount compared to non-gendered profiles, regardless of the gender on Google+.

    The definition of gendered basically means including the name or photo. In other words, putting your name and/or photo as your GitHub username is significantly correlated with decreased chances of a PR being merged as an outsider.

    I suspect this definition of gendered also correlates heavily with other forms of discrimination. For example, name or photo likely also reveals ethnicity or skin colour in many cases. So an alternative hypothesis is that there is racism at play in deciding which PRs people, on average, accept. This would be a significant confounding factor with gender if the gender split of Open Source contributors is different by skin colour or ethnicity (which is plausible if there are different gender roles in different nations, and obviously different percentages of skin colour / ethnicity in different nations).

    To really prove this is a gender effect they could do an experiment: assign participants to submit PRs either as a gendered or non-gendered profile, and measure the results. If that is too hard, an alternative for future research might be to at least try harder to compensate for confounding effects.